swyx
achieve ambition with intentionality, intensity, & integrity
- @smol_ai
- @dxtipshq
- @sveltesociety
- @aidotengineer
- @coding_career
- @latentspacepod
the reason llm analysis (and regulation, and PMing) is hard*
is that the relevant DIMENSIONS keep moving with each generation of frontier model; it is not enough to just put your x or y axis in log scale and track scaling laws, you have to actually do the work to think about how models are structurally different in 2025 vs 2024 vs 2023 and so on
eg
everyone focused on elo for 2 years, elo gets gamed and loses credibility
everyone focused on price per token for 3 years, reasoning models have 10-40x variation in output tokens per task, price per token loses meaning
collect data all you want but if you are just collecting pristine time series you can lose sight of the bigger picture
*(and why statements like “ai engineer is not a thing because all software engineers are ai engineers” are cope and will never be right except in the most trivial sense)
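
To make the price-per-token point concrete, here is a minimal sketch of the arithmetic. All prices and token counts are hypothetical numbers chosen for illustration (they are not from the post), and only output tokens are counted. It shows how a model that is 5x cheaper per token can still cost more per task once it emits 40x the output tokens.

```python
def cost_per_task(price_per_million_output_tokens: float,
                  output_tokens_per_task: int) -> float:
    """Dollar cost of one task, counting output tokens only."""
    return price_per_million_output_tokens * output_tokens_per_task / 1_000_000


# Hypothetical numbers, for illustration only.
# "terse": pricier per token, short answers.
terse = cost_per_task(price_per_million_output_tokens=10.0,
                      output_tokens_per_task=500)
# "verbose": 5x cheaper per token, but 40x the output tokens per task.
verbose = cost_per_task(price_per_million_output_tokens=2.0,
                        output_tokens_per_task=20_000)

print(f"terse model:   ${terse:.4f} per task")
print(f"verbose model: ${verbose:.4f} per task")
print(f"verbose / terse cost ratio: {verbose / terse:.1f}x")
```

Under these assumed numbers the "cheaper" model ends up 8x more expensive per task, which is why cost per task is the more useful axis to track than price per token.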

Scott Huston, 11 hours ago
Is there a public spreadsheet of all the leading LLM models from different companies showing their pricing, benchmark scores, arena elo scores etc?
swyx reposted
🆕 Releasing our entire RL + Reasoning track!
featuring:
• @willccbb, Prime Intellect
• @GregKamradt, Arc Prize
• @natolambert, AI2/Interconnects
• @corbtt, OpenPipe
• @achowdhery, Reflection
• @ryanmart3n, Bespoke
• @ChrSzegedy, Morph
with special 3 hour workshop from:
@danielhanchen of Unsloth!
start here:
Happy weekend watching! and thanks to @OpenPipeAI for supporting and hosting this track!

swyx reposted
if, as @sgrove proposes, specs are the code of the future, then what is debugging?
1) spec compilation is the process of a coding agent turning specs into code
2) more and more “compilation” will be unattended, less watching the agent work diff by diff, more spec in, code out
3) type errors -> truth errors: most debugging will be digging through research and implementation plans in markdown to find the one line of incorrect context that makes the coding agent fail during implementation. Test suites will, among other things, check for truth and logical consistency.
4) there is a new higher order flavor of “attaching a step debugger” which is watching the agent implement a plan step by step to pinpoint the logic error in the spec. When you find an error when stepping through a program line by line, you change the code, restart the process, and repeat until it’s working. When you find an error in a *spec* while stepping through an implementation, you go upstream, fix the spec, and restart the *implementation*
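
The loop described in point 4 can be sketched roughly as follows. This is a toy illustration under stated assumptions, not how any real coding agent works: the spec is modeled as a list of plain-text lines, and `implement_step` and `fix_line` are hypothetical callables standing in for "the agent implements one step" and "a human corrects the bad line of context".

```python
from typing import Callable

Spec = list[str]  # assumption: a spec is just a list of plain-text lines


def debug_spec(spec: Spec,
               implement_step: Callable[[str], bool],
               fix_line: Callable[[str], str],
               max_restarts: int = 10) -> Spec:
    """Step through the implementation line by line. On the first failing
    step, fix that line in the spec (upstream) and restart the whole
    implementation from the top."""
    for _ in range(max_restarts):
        # Index of the first spec line the agent fails on, or None.
        bad = next((i for i, line in enumerate(spec)
                    if not implement_step(line)), None)
        if bad is None:
            return spec  # every step implemented cleanly
        # Fix the spec, not the generated code, then restart.
        spec = spec[:bad] + [fix_line(spec[bad])] + spec[bad + 1:]
    raise RuntimeError("spec still failing after max_restarts")


# Toy usage: a step "fails" whenever its spec line contains the marker WRONG.
spec = ["serve on port WRONG", "return JSON"]
fixed = debug_spec(
    spec,
    implement_step=lambda line: "WRONG" not in line,
    fix_line=lambda line: line.replace("WRONG", "8080"),
)
print(fixed)
```

The design choice worth noting is that the fix is applied to the spec and the implementation is rerun from scratch, rather than patching the generated output in place.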
we're releasing one track a day from the @aidotengineer conf now*. yesterday's RecSys track was a big hit - but by far the hottest track was our coverage of the state of MCP, hosted by @Calclavia
personal fave slide is this where i realized @AnthropicAI dogfoods MCP -way- harder than i initially thought from our podcast with @dsp_ and @jspahrsummers
take a look at these talks and give your fave speakers a shoutout!
*most already available as "unlisted" via the "Complete Playlist" if you search

"Three things: a deep research model with enhanced search browser; a revolutionary computer-use operator; and a sandboxed terminal to execute math and code. A browser, a computer, a terminal… are you getting it?
These are not three separate devices.
This is one device, and we are calling it Agent."

if you haven't tried the Chrome + iMessage + Apple Notes + Linear + Gmail + GCal DXT integrations in Claude you are missing out on the literal LLM OS evolution
Smarter Siri is here; it's just called Claude Desktop


Alex Albert, Jun 27, 2025
We've simplified local MCP usage by creating something new we call Desktop Extensions (.dxt files).
These package your local server, handle dependencies, and provide secure configuration so you can one-click share and install local servers on Claude Desktop and other apps.
