Why does AI hallucinate, and can we prevent it?

Despite their sophistication, LLMs still often produce inaccurate or misleading output. Will these hallucinations ever go away for good?

You don't need to use generative AI for long before encountering one of its major weaknesses: hallucination.

Hallucinations occur when a large language model generates false or nonsensical information. With the current state of LLM technology, it doesn't appear possible to eliminate hallucinations entirely. However, certain strategies can reduce the risk of hallucinations and minimize their effects when they do occur.

To address the hallucination problem, start by understanding what causes LLMs to hallucinate, then learn practical techniques to mitigate those issues.

What is an LLM hallucination?

An LLM hallucination is any output from an LLM that is false, misleading or contextually inappropriate.

Most LLMs can produce different types of hallucinations. Common examples include the following:

  • Inaccurate information. The model generates factually incorrect statements, such as "The capital of the U.S. is London."
  • Conflicting information. The output contains contradictions, such as "John is a teenager who is 67 years old."
  • Input/output mismatch. The model's response doesn't align with the prompt -- for example, asking for a cookie recipe and getting directions to a grocery store. Even if the directions are accurate, they don't answer the user's question.

Hallucinations can be especially harmful in high-stakes settings where accuracy and reliability are essential, such as financial analysis or applications involving human safety.

LLM hallucination causes

LLM technology has advanced rapidly in recent years. In the late 2010s, machine learning developers were still experimenting with basic generative AI models. Today, LLM-powered generative AI is widely adopted across industries.

Yet hallucinations remain a persistent challenge -- and likely always will, as long as generative AI relies on LLMs. This is because hallucinations stem from inherent limitations of LLMs: training data constraints, context window limitations, attention issues and overfitting.

1. Training data constraints

LLMs can only generate information based on what they've been exposed to through their training data -- but they are also incentivized to always respond to a user's question. Consequently, if a user prompts a model with a question not covered in its training data, the model might generate a false or nonsensical response.

2. Context window limitations

Transformer-based LLMs process input as a series of linguistic components called tokens, which are typically word- or subword-level units of text. The number of tokens a model can process simultaneously is called its context window. The larger the context window, the more information the model can consider at once, reducing the likelihood of hallucinations.

However, expanding a model's context window requires significantly more compute. For this reason, most LLMs operate with limited context windows. This can cause hallucinations if the prompt and the information needed to adequately interpret it don't fully fit within the available context.
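To see how quickly a prompt consumes that budget, the sketch below counts a prompt's tokens before it is sent to a model. It assumes the open source tiktoken library is installed, and the 8,192-token limit is an illustrative figure rather than any particular model's context window.

# A rough sketch of checking prompt length against a context window.
# Assumes the tiktoken package is installed; the limit below is illustrative.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")

prompt = "Summarize the attached quarterly report and list the top three risks."
tokens = encoding.encode(prompt)

CONTEXT_WINDOW = 8192  # hypothetical limit, for illustration only
print(f"Prompt uses {len(tokens)} of {CONTEXT_WINDOW} tokens")
if len(tokens) > CONTEXT_WINDOW:
    print("The prompt exceeds the context window and would be truncated or rejected.")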

3. Attention issues

Attention mechanisms help LLMs decide which tokens to prioritize. If a model makes incorrect assumptions about which tokens are more important than others, hallucinations can result.

For example, in the prompt "What is the significance of Washington, D.C.?" a model that misallocates attention might overweight Washington and generate an answer about George Washington rather than the city. Avoiding hallucinations requires balanced attention to the full context of a query.
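To make this concrete, the following is a minimal, single-head sketch of scaled dot-product attention in NumPy. It is a deliberate simplification -- real LLMs use many attention heads, learned projection matrices and far larger dimensions -- but it shows how attention weights determine which tokens the model prioritizes.

# A simplified, single-head sketch of scaled dot-product attention.
# Real models add learned projections, multiple heads and masking.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Return per-token outputs plus the weights showing where each query 'looked'."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # similarity between queries and keys
    weights = softmax(scores, axis=-1)       # each row sums to 1 across input tokens
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))  # 4 tokens, 8-dimensional vectors
outputs, weights = attention(Q, K, V)
print(weights.round(2))  # a row that piles weight on the wrong token mirrors the Washington example

If the weight assigned to the tokens for D.C. is too small relative to Washington, downstream layers effectively never "see" the disambiguating context.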

4. Overfitting

Overfitting happens when a model memorizes specific examples from its training data instead of learning generalized patterns that transfer to new inputs. This inability to generalize can cause hallucinations in the form of misleading or incomplete responses.

For instance, if the phrase roses are red appears frequently in a model's training data, an overfitted model might default to this answer when asked, "What color are roses?" Although roses come in several colors -- and the training data might even contain references to roses in those colors -- overfitting leads to a predominant association between roses and red.
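The toy model below illustrates the same effect. It is not how an LLM works internally -- it is just a frequency table with greedy selection -- but it shows how a skewed data set, combined with always choosing the most memorized continuation, produces the same answer every time even though other valid answers exist in the data.

# A toy illustration of a memorized association, not a real language model.
from collections import Counter, defaultdict

# Skewed "training data": one completion dominates.
corpus = ["roses are red"] * 20 + ["roses are white", "roses are yellow"]

# Memorize the completions seen for each prefix.
completions = defaultdict(Counter)
for sentence in corpus:
    *prefix, last = sentence.split()
    completions[" ".join(prefix)][last] += 1

print(completions["roses are"])                       # Counter({'red': 20, 'white': 1, 'yellow': 1})
print(completions["roses are"].most_common(1)[0][0])  # always 'red'; white and yellow are never produced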

The challenge of detecting hallucinations

In real-world use, AI hallucinations are often much more subtle than the examples listed above. For instance, a coding LLM might introduce bugs by referencing nonexistent versions of real packages. Or a model summarizing a book might misstate certain aspects of the argument without being entirely incorrect.

This subtlety means that, in practice, hallucinations can be hard to catch. As a result, one of the biggest challenges in mitigating them is simply having enough human oversight and domain expertise to detect when they happen.

How to mitigate LLM hallucination risks

Because hallucinations are caused by core limitations in how LLMs function, eliminating them entirely is unlikely. However, developers can take several steps to reduce their frequency and severity:

  • More training data. Increasing the volume of training data helps reduce hallucinations by giving the model more chances to encounter information that is relevant to a prompt.
  • Higher-quality training data. Even without expanding the overall data set size, improving training data quality can lower hallucination risk by making sure that the model learns from accurate, well-structured sources.
  • Larger context window size. Expanding a model's context window enables it to consider more information at once, which generally improves accuracy. However, this comes with higher compute costs, as processing more tokens requires more resources.
  • Retrieval-augmented generation. The RAG technique supplements an LLM's internal knowledge base by enabling the model to access external sources, such as company documentation or a vetted database, at query time. This is especially effective for domain-specific use cases, such as providing answers based on a company's internal data (see the sketch after this list).
  • Fine-tuning. Fine-tuning tailors an LLM to a particular task or domain by training it on additional, targeted data. Like RAG, fine-tuning can reduce hallucinations in domain-specific contexts. However, it can backfire and increase hallucinations if the fine-tuning data is low-quality or overly narrow.
  • Output filtering. Model output monitoring can flag and block inaccurate or unreliable responses before they are delivered to an end user. Monitoring can be manual, in the form of human review, or automated, with one LLM interpreting the responses of another.
  • Prompt engineering. Well-structured, contextually rich prompts reduce ambiguity and guide the model toward more accurate answers. Training users in effective prompt engineering techniques -- such as minimizing irrelevant context and clarifying intent -- can help mitigate hallucinations at the source.

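As a concrete illustration of the RAG approach mentioned above, the sketch below retrieves relevant passages and folds them into a grounded prompt. The knowledge base, the keyword-overlap retriever and the instruction wording are all stand-ins for illustration; a production system would typically use vector embeddings, a vector store and a real LLM API.

# A minimal retrieval-augmented generation sketch. The retriever and the
# knowledge base are illustrative stand-ins, not a production design.
import re

KNOWLEDGE_BASE = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday through Friday, 9 a.m. to 5 p.m. ET.",
    "Enterprise plans include a dedicated account manager.",
]

def words(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by simple keyword overlap with the question."""
    q_words = words(question)
    ranked = sorted(documents, key=lambda d: len(q_words & words(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question: str, context: list[str]) -> str:
    """Ground the model in retrieved text and tell it not to go beyond it."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {question}\nAnswer:"
    )

question = "What is the refund policy?"
prompt = build_prompt(question, retrieve(question, KNOWLEDGE_BASE))
print(prompt)  # This grounded prompt is what gets sent to the LLM.

The instruction to answer only from the supplied context, and to admit when that context is insufficient, does much of the work here; retrieval narrows the model's options, but the prompt wording is what discourages it from inventing an answer.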
Will hallucinations ever go away?

LLM developers have made progress in reducing hallucinations. For example, a study in the Journal of Medical Internet Research found that hallucination rates dropped by about 28% between GPT-3.5 and GPT-4. Still, GPT-4's hallucination rate stood at 28.6% -- hardly a reassuring figure for organizations seeking a hallucination-proof LLM.

Findings like these suggest that hallucination rates might decline gradually over time as models improve, but they will almost certainly never reach zero. Achieving that would require the following:

  • Training data that accurately covers all possible prompts.
  • Unlimited context window sizes.
  • Perfectly functioning attention mechanisms.
  • Complete immunity to overfitting.

The first two are simply impossible. No data set can anticipate every possible prompt, and infinite context windows would require unlimited compute power. Improvements in model design could reduce attention and overfitting issues, but it's unrealistic to expect that they could be eliminated completely.

Arguably, the only way to fully eliminate hallucinations is to use -- or invent -- alternative forms of AI, such as deterministic systems that always return the same response to the same input. A significant drawback of these systems is that they typically can respond only to a limited range of questions. However, as long as those responses are accurate, there is no hallucination risk.

Chris Tozzi is a freelance writer, research adviser, and professor of IT and society who has previously worked as a journalist and Linux systems administrator.
