Even a wrong answer is right some of the time
AI models often produce false outputs, or "hallucinations." Now OpenAI has admitted these may stem from fundamental flaws in the way it trains and evaluates its models.
The admission came in a paper [PDF] published in early September, titled "Why Language Models Hallucinate," and penned by three OpenAI researchers and Santosh Vempala, a distinguished professor of computer science at Georgia Institute of Technology. It concludes that "the majority of mainstream evaluations reward hallucinatory behavior."
Language models are primarily evaluated using exams that penalize uncertainty
The fundamental problem, the researchers argue, is that models are trained and evaluated in ways that reward guessing rather than admitting uncertainty. A guess might produce a superficially plausible answer; telling users your AI can't find one is less satisfying.
"Over thousands of test questions, the guessing model ends up looking better on scoreboards than a careful model that admits uncertainty," OpenAI admitted in a blog post accompanying the release.