DApp Store | Web3 Hub for Events & Games

Trending topics

It took a few years of deep thinking, but I'm super excited to finally share PROSPER: a beautiful, regression-based algorithm for RL from *rubric rewards* that robustly handles the *inconsistent feedback* that LLM judges provide. Let's go Back to Black(well)! 🧵(1/n)
