this is the most important chart in the world, and it's going absolutely ballistic
METR, 10 hours ago:
We estimate that Claude Opus 4.6 has a 50%-time-horizon of around 14.5 hours (95% CI of 6 hrs to 98 hrs) on software tasks. While this is the highest point estimate we’ve reported, this measurement is extremely noisy because our current task suite is nearly saturated.
linear version is completely bonkers. superexponential
for the uninitiated, this means: (certain) software tasks that would ordinarily take a human around 14.5 hours to complete can now be done by AI (in a much shorter time, likely minutes) with a 50% probability. that's about 4.4 orders of magnitude since 2019, or a 26,000x increase since GPT-2
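quick sanity check on that arithmetic, as a rough sketch: it takes the 14.5 hour point estimate and the 26,000x figure at face value (ignoring the very wide confidence interval), and the variable names are just mine for illustration.

```python
import math

# Figures taken at face value from the thread; both are assumptions for this sketch.
HORIZON_HOURS = 14.5     # METR's point estimate for the 50%-time-horizon
GROWTH_FACTOR = 26_000   # claimed increase since GPT-2

# How many powers of ten the claimed growth factor corresponds to.
orders_of_magnitude = math.log10(GROWTH_FACTOR)

# How many times the horizon would have had to double to grow by that factor.
doublings = math.log2(GROWTH_FACTOR)

# The GPT-2 era horizon implied by dividing the current estimate by the growth factor.
implied_baseline_seconds = HORIZON_HOURS * 3600 / GROWTH_FACTOR

print(f"orders of magnitude: {orders_of_magnitude:.2f}")
print(f"doublings: {doublings:.1f}")
print(f"implied GPT-2 horizon: {implied_baseline_seconds:.1f} s")
```

so a 26,000x jump is a bit under 15 doublings, and it implies a GPT-2 horizon of roughly two seconds. given the 6 to 98 hour interval, treat all of these as ballpark numbers.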