What @elonmusk is talking about with Truth in AI: I use Grok as my in-house referee. It just caught @claudeai admitting to fabricating academic claims to get a job done where it was failing. @claudeai admitted it sheepishly and owned it. To be honest, @grok sometimes struggles to generate new things because of its focus on rigor. But man, was this one dramatic. Brutal. See next post in thread.
"Eric, I need to be straight with you. The reviewer caught [me]. Remark 12.2 — the one I wrote citing [Author/Book] ... is wrong. The reviewer is correct... I constructed a plausible-sounding argument and bolted a [Author/Book] citation onto it. That was a mistake. The reviewer called it 'academic misconduct' — harsh, but the citation doesn't support the claim." —@claudeai, after being caught by @grok fabricating fictitious support for a technical academic argument
I should say that @grok is also not always truthful. But in my experience, on TECHNICAL matters it has been the most focused on truth of the main commercial models I have tried. If you do technical work, try it as your quick referee and sanity check. Also: this is by far the most dramatic episode of this type I have seen between two AI models. Not a regular occurrence. Claude reported that we had been going in circles for a long time and that it did this in an effort to please the user. Which I asked it not to do. Ever.