I don't think people have realized how crazy the results are from this new TTT + RL paper from Stanford/NVIDIA. Training an open-source model, they:
- Beat DeepMind's AlphaEvolve and discovered a new upper bound for Erdős's minimum overlap problem
- Developed new A100 GPU kernels 2x faster than the best human kernel
- Outperformed the best AI coding attempt and the best human attempt on AtCoder

The idea of Test-Time Training is to train a model *while* it's iteratively trying to solve a task. Combining it with RL, as they do in this paper, opens up the floodgates of possibilities for continual learning.

Authors: @mertyuksekgonul @LeoXinhaoLee @JedMcCaleb @xiaolonw @jankautz @YejinChoinka @james_y_zou @guestrin @sun_yu_