I tried autoresearch from @karpathy for data engineering. Claude autonomously built a complete dataset by finding undocumented endpoints, scraping archives, and coming up with really creative ways to get the data I wanted for an analysis.

The interesting thing is that you don't get a score that instantly tells you whether an experiment produced something worth keeping, so Claude both runs the experiment and decides what is worth keeping.

So even if your problem is not directly verifiable, AI is probably capable enough to evaluate its own results and keep making progress.
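
For anyone curious what "does the experiment and decides what is worth keeping" could look like as a loop, here is a minimal sketch. It is not autoresearch's actual code: `Attempt`, `Verdict`, `Dataset`, `run_loop`, `make_demo_propose`, and `demo_judge` are all hypothetical names made up for illustration. In a real run both `propose` and `judge` would be model calls; here they are deterministic stand-ins so the example runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Attempt:
    description: str  # what was tried, e.g. "archive scrape"
    rows: list        # records the attempt produced


@dataclass
class Verdict:
    keep: bool
    reason: str


@dataclass
class Dataset:
    rows: list = field(default_factory=list)
    log: list = field(default_factory=list)  # (description, Verdict) pairs


def run_loop(
    propose: Callable[[Dataset], Optional[Attempt]],
    judge: Callable[[Dataset, Attempt], Verdict],
    max_steps: int,
) -> Dataset:
    """Run experiments with no external score: the judge decides what stays."""
    dataset = Dataset()
    for _ in range(max_steps):
        attempt = propose(dataset)
        if attempt is None:  # nothing left to try
            break
        verdict = judge(dataset, attempt)
        dataset.log.append((attempt.description, verdict))
        if verdict.keep:
            dataset.rows.extend(attempt.rows)
    return dataset


def make_demo_propose() -> Callable[[Dataset], Optional[Attempt]]:
    # Hypothetical stand-in for the model proposing and running experiments.
    queue = iter([
        Attempt("undocumented endpoint", [{"id": 1}, {"id": 2}]),
        Attempt("archive scrape", [{"id": 2}]),
        Attempt("empty page", []),
        Attempt("second archive", [{"id": 3}]),
    ])
    return lambda dataset: next(queue, None)


def demo_judge(dataset: Dataset, attempt: Attempt) -> Verdict:
    # Hypothetical stand-in for the model's self-evaluation:
    # keep an attempt only if every row it produced is new.
    seen = {row["id"] for row in dataset.rows}
    new_rows = [row for row in attempt.rows if row["id"] not in seen]
    if attempt.rows and len(new_rows) == len(attempt.rows):
        return Verdict(True, f"{len(new_rows)} new row(s)")
    return Verdict(False, "nothing new")


if __name__ == "__main__":
    result = run_loop(make_demo_propose(), demo_judge, max_steps=10)
    for description, verdict in result.log:
        print(description, "->", "keep" if verdict.keep else "drop", f"({verdict.reason})")
```

The key design choice is that `judge` returns a keep/drop decision plus a reason instead of a numeric score, and the log keeps every verdict so a human can audit what the agent decided to keep and why.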