god the prime intellect RL residents have been cooking so hard. a major bottleneck in continual learning is that we don't have a general way to compare and evaluate methods across task domains. i think @carnot_cyclist may have solved this
i won't spoil it because i want him to write a banger blog post about it. but wow it's just a really really clean formalism that can be used for so many different things, and he's got some nice early experimental results to show it off