As an academic, I am sympathetic: publishing takes a while and it is hard to keep up with frontier models. But especially if your argument is "AI is bad at X," you need to explain why you think that won't change, graph any trend as models improve, and update before publication.
Kevin Roose
i am begging academics to study AI capabilities using frontier models. the models used in this study (which is going to be cited for years as proof that "AI is bad at health advice") are GPT-4o, Llama 3, and Command R+, two obsolete models and one i've never heard of.
The paper actually makes two big, real points, however: (1) Humans were bad at prompting (obsolete) AI to get medical advice. (2) Benchmarks of medical knowledge don't always reflect real-world performance in serving patients. I suspect (1) is no longer as true; (2) has not changed.