auto-researching memory evals for a custom pi agent. VERY cool framework so far. i'm also testing out this /autoresearch claude code skill (link in replies); apparently it turns agents into a complete auto-researcher pipeline. i'm EVALing it against real memory datasets with real agent runs, specifically on how well they can remember a codebase. will report results
OK CLAUDE BOI !!!
this is so cool