The easiest way to reduce next-token prediction loss: make the thing you are predicting more like you.