🚨"人类的最后考试"发布:2500个问题区分真实的AI与伪装者 X刚刚揭示了终极学术挑战——一个如此全面的基准,旨在成为最后一次所需的测试。 数学占41%,其次是科学和人文学科。 名字说明了一切:这是终结所有考试的考试。一旦AI通过了这个测试,还有什么需要证明的呢? 我们正在构建这个测试,以确定机器何时正式超越我们。 来源:@xai @elonmusk
Mario Nawfal
Mario Nawfal7月10日 12:12
🚨GROK'S "LUDICROUS" PROGRESS: 10X IMPROVEMENTS WITH EACH VERSION X just dropped the receipts on Grok's evolution. Each generation delivers 10x better performance across the board - from basic predictions to advanced reasoning. Grok 4's reasoning capabilities dwarf everything before it. The exponential growth curve looks like a rocket launch. While others inch forward, Grok multiplies. This is what compound technological progress actually looks like. The AI race just got interesting. Source: @xai @elonmusk
41