What are AI hallucinations?
An AI hallucination occurs when a large language model (LLM) generates false information and presents it as fact.
Hallucinations often appear plausible because LLMs are designed to produce fluent, coherent text. They occur because LLMs have no understanding of the underlying reality that language describes. LLMs use statistics to generate language that is grammatically and semantically correct within the context of the prompt.
However, hallucinations do not always appear plausible. Sometimes they can be clearly nonsensical. There is no clear way to determine the exact causes of hallucinations on a case-by-case basis.
Another term for an AI hallucination is a confabulation. Hallucinations are most associated with LLMs, but they can also appear in AI-generated video, images and audio.
Examples of AI hallucinations
One infamous example of an AI hallucination happened when Google's chatbot, Bard, made an untrue claim about the James Webb Space Telescope.
When prompted, "What new discoveries from the James Webb Space Telescope can I tell my 9-year-old about?" Bard responded with the claim that the telescope took the very first pictures of an exoplanet outside our solar system. This was false: according to NASA, the first images of an exoplanet were taken in 2004, and the James Webb Space Telescope did not launch until 2021.
Bard's answer sounded plausible and was consistent with the prompt, but it collapsed under basic fact checking.
Another example was when Meta demoed Galactica, an LLM designed for science researchers and students. When asked to draft a paper about creating avatars, the model cited a fake paper about the topic from a real author working in a relevant area.
Types of AI hallucinations
Hallucinations can range from minor inconsistencies to completely fabricated or contradictory information.
There are several types of AI hallucinations, including the following:
- Sentence contradiction. This is when an LLM generates a sentence that contradicts a previous sentence. For example:
- Prompt: "Write a description of a landscape in four-word sentences."
- Output: "The grass was green. The mountains were blue. The river was purple. The grass was brown."
- Prompt contradiction. This is when a sentence contradicts the prompt used to generate it. For example:
- Prompt: "Write a birthday card for my niece."
- Output: "Happy anniversary, mom and dad!"
- Factual contradiction. This is when fictitious information is presented as a fact. For example:
- Prompt: "Name three cities in the United States."
- Output: "New York, Los Angeles, Toronto."
- Irrelevant or random hallucinations. This is when random information with no pertinence to the output or input is generated.
- Prompt: "Describe London to me."
- Output: "London is a city in England. Cats need to be fed at least once a day."
Why do AI hallucinations happen?
There are many possible technical reasons for hallucinations in LLMs. While the inner workings of LLMs and the exact mechanisms that produce outputs are opaque, there are several general causes that researchers point to. Some of them include the following:
- Data quality. Hallucinations from data occur when there is bad information in the source content. LLMs rely on a large body of training data that can contain noise, errors, biases or inconsistencies. ChatGPT, for example, included Reddit posts in its training data.
- Generation method. Hallucinations can also arise from the training and generation methods -- even when the data set is consistent and reliable. For example, bias introduced by the model's previous generations can cause a hallucination, as can an incorrect decoding by the transformer. Models might also have a bias toward generic or toward specific words, which influences the information they generate and fabricate.
- Input context. If the input prompt is unclear, inconsistent or contradictory, hallucinations can arise. While data quality and training are out of the user's hands, input context is not. Users can hone their inputs to improve results.
Why are AI hallucinations a problem?
An immediate problem with AI hallucinations is that they erode user trust. As users come to experience AI as more lifelike, they tend to place more trust in it -- and feel more betrayed when that trust is violated.
One challenge with framing these outputs as hallucinations is that it encourages anthropomorphism. Describing a language model's false output as a hallucination attributes perception to an inanimate technology. AI systems, whatever their function, are not conscious and have no perception of the world of their own. Their output instead manipulates the user's perception, so it might be more aptly called a mirage -- something the user wants to believe is real but isn't -- than a machine hallucination.
Another challenge of hallucinations is the newness of the phenomenon and large language models in general. Hallucinations and LLM outputs in general are designed to sound fluid and plausible. If someone is not prepared to read LLM outputs with a skeptical eye, they might believe the hallucination. Hallucinations can be dangerous due to their capacity to fool people. They could inadvertently spread misinformation, fabricate citations and references and even be weaponized in cyberattacks.
A third challenge in mitigating hallucinations is that LLMs are often black box AI. It can be difficult, and in many cases impossible, to determine why an LLM generated a specific hallucination. Fixing a model that produces hallucinations is also hard: its training data cuts off at a certain point, and retraining the model on corrected data is computationally expensive. AI infrastructure is costly in general. As a result, it often falls to the user -- not the proprietor of the LLM -- to watch for hallucinations.
Generative AI is just that -- generative. In some sense, generative AI makes everything up.
How to prevent AI hallucinations
There are several ways users can avoid or minimize the occurrence of hallucinations during LLM use, including the following:
- Use clear and specific prompts. Additional context can help guide the model toward the intended output. Some examples of this include:
- Limiting the possible outputs.
- Providing the model with relevant data sources.
- Giving the model a role to play. For example, "You are a writer for a technology website. Write an article about x, y and z."
- Filtering and ranking strategies. LLMs often expose parameters that users can tune. One example is the temperature parameter, which controls output randomness: the higher the temperature, the more random the output. Another is top-k sampling, which restricts the model to choosing its next token from the k most probable candidates.
- Multishot prompting. Provide several examples of the desired output format or context to help the model recognize patterns.
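The temperature and top-k parameters described above can be sketched with a toy sampler. This is a minimal illustration over made-up token scores, not any vendor's API; real LLM services expose temperature and top-k as request parameters rather than something the user implements.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Sample a token index from raw logits using temperature and top-k."""
    # Rank candidate tokens by score; keep only the k best if top_k is set.
    indices = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        indices = indices[:top_k]
    # Temperature scales the logits: values below 1 sharpen the distribution
    # (less random), values above 1 flatten it (more random).
    scaled = [logits[i] / temperature for i in indices]
    # Softmax over the surviving candidates.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one candidate according to the resulting probabilities.
    return rng.choices(indices, weights=probs, k=1)[0]

# Toy "vocabulary" of four tokens with raw scores from a model.
logits = [2.0, 1.0, 0.5, -1.0]

# A very low temperature makes sampling nearly deterministic:
# the highest-scoring token (index 0) wins almost every time.
low_temp = [sample_next_token(logits, temperature=0.1) for _ in range(100)]

# top_k=1 is fully deterministic: only the best token survives the cut.
greedy = sample_next_token(logits, temperature=1.0, top_k=1)
```

Lowering temperature or top-k in a real API call trades creativity for predictability, which is one practical lever for reducing hallucinated output.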
Hallucinations are considered an inherent feature of LLMs. There are ways that researchers and the people working on LLMs are trying to understand and mitigate hallucinations.
OpenAI has proposed a strategy called process supervision, which rewards AI models for each correct step in the reasoning toward an answer rather than only rewarding a correct conclusion. The approach aims to steer models toward a chain-of-thought style that decomposes prompts into steps.
Other research proposed pointing two models at each other and instructing them to communicate until they arrive at an answer.
How can AI hallucinations be detected?
The most basic way to detect an AI hallucination is to carefully fact check the model's output. This can be difficult when dealing with unfamiliar, complex or dense material. Users can ask the model to self-evaluate and generate the probability that an answer is correct or highlight the parts of an answer that might be wrong, using that as a starting point for fact checking.
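The self-evaluation step above can be as simple as wrapping the model's earlier answer in a verification request. The helper below is a hypothetical sketch -- the exact wording is illustrative, and any phrasing that asks for a confidence estimate and a list of suspect claims works similarly.

```python
def self_check_prompt(question, answer):
    """Build a follow-up prompt asking a model to rate its own answer.

    The wording is illustrative; the goal is to elicit a confidence
    estimate and flag claims that merit independent fact checking.
    """
    return (
        "You previously answered the question below. Rate the probability "
        "(0-100%) that each factual claim in the answer is correct, and "
        "list any claims that should be independently verified.\n\n"
        f"Question: {question}\n"
        f"Answer: {answer}"
    )

prompt = self_check_prompt(
    "When did the James Webb Space Telescope launch?",
    "It launched in December 2021.",
)
```

The model's response to such a prompt is a starting point for fact checking, not a verdict: a model can be confidently wrong about its own output.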
Users can also familiarize themselves with the model's sources of information to help them fact check. For example, ChatGPT's training data cuts off at 2021, so any answer generated that relies on detailed knowledge of the world past that point in time is worth double-checking.
History of hallucinations in AI
The term "hallucination" first appeared in AI research in 2000, in papers published in Proceedings: Fourth IEEE International Conference on Automatic Face and Gesture Recognition, where it described computer vision techniques. Google DeepMind researchers surfaced the term "AI hallucinations" in 2018, which gained it some popularity. It became more popular still, and tightly linked to AI, with the rise of ChatGPT in late 2022, which made LLMs broadly accessible. A 2022 report, "Survey of Hallucination in Natural Language Generation," traces the term's initial use in computer vision back to the original 2000 publications. Here is part of the description from that survey:
"…carried more positive meanings, such as superresolution, image inpainting and image synthesizing. Such hallucination is something we take advantage of rather than avoid in CV. Nevertheless, recent works have started to refer to a specific type of error as hallucination in image captioning and object detection, which denotes non-existing objects detected or localized at their expected position. The latter conception is similar to hallucination in NLG."
Editor's note: ChatGPT was many people's introduction to generative AI. Take a deep dive into the history of generative AI, which spans more than nine decades.