- If you continue the METR trend, you see ~100h models by end of the year! (~8x more powerful than now)
- METR will really struggle to have the benchmarks needed to assess models of that power
- We can no longer rule out significant automation of AI development THIS YEAR
Ajeya Cotra, Mar 5, 23:17
New post: on Jan 14, I predicted that SWE time horizon by EOY would be ~24 hours. Now I think it'll be >100 hours, and maybe unbounded. For the first time, I don't see solid evidence against AI R&D automation *this year.* Link below.
@Douglas_Schon The mean ratio p80/p50 is ~0.19... it's remarkably stable.
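A rough sanity check of the numbers quoted in this thread, as a minimal Python sketch. It assumes that "~8x" refers to the ratio of the projected 50% time horizon to today's, and that the ~0.19 p80/p50 ratio would still hold at a 100-hour horizon; neither assumption is confirmed by the posts. The function names are hypothetical and used only for illustration.

```python
import math


def implied_current_horizon(target_hours: float, multiple: float) -> float:
    """Horizon today if the target is `multiple` times the current value."""
    return target_hours / multiple


def doublings(multiple: float) -> float:
    """Number of doublings needed to grow by a factor of `multiple`."""
    return math.log2(multiple)


def p80_from_p50(p50_hours: float, ratio: float = 0.19) -> float:
    """80%-success horizon implied by a 50%-success horizon.

    Assumes the ~0.19 ratio quoted in the thread stays constant.
    """
    return p50_hours * ratio


if __name__ == "__main__":
    print(implied_current_horizon(100, 8))  # hours today under the 8x assumption
    print(doublings(8))                     # doublings between now and ~100h
    print(p80_from_p50(100))                # p80 horizon at a 100h p50 horizon
```

Under these assumptions, ~100h at 8x implies a current horizon of about 12.5 hours, three doublings by end of year, and an 80%-success horizon of roughly 19 hours.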
@djinnius @microfounded @eli_lifland I also have a Substack