new research on 445 ai benchmarks
• 48% disagree on what they measure
• 39% use convenient, not correct, data
• 16% test for statistical significance
we still don't know how to measure our most powerful tools
IMO treat evals like sports, not the SAT
competition > tests
clear rules -> human-understandable results


