AI hallucinations are incorrect or false responses given by generative AI models.
After reading this article you will be able to:
Copy article link
Artificial intelligence (AI) hallucinations are falsehoods or inaccuracies in the output of a generative AI model. Often these errors are hidden within content that appears logical or is otherwise correct. As usage of generative AI and large language models (LLMs) has become more widespread, many cases of AI hallucinations have been observed.
The term "hallucination" is metaphorical — AI models do not actually suffer from delusions as a mentally unwell human might. Instead they produce unexpected outputs that do not correspond to reality in response to prompts. They may misidentify patterns, misunderstand context, or draw from limited or biased data to get those unexpected outputs.
Some documented examples of AI hallucinations include:
While AI has a number of use cases and real-world applications, in many cases, AI models' tendency to hallucinate means they cannot be entirely relied upon without human oversight.
All AI models are made up of a combination of training data and an algorithm. An algorithm, in the context of AI, is a set of rules that lay out how a computer program should weight or value certain attributes. AI algorithms contain billions of parameters — the rules on how attributes should be valued.
Generative AI needs training data because it learns by being fed millions (or billions, or trillions) of examples. From these examples, generative AI models learn to identify relationships between items in a data set — typically by using vector databases that store data as vectors, enabling the models to quantify and measure the relationships between data items. (A "vector" is a numerical representation of different data types, including non-mathematical types like words or images.)
Once the model has been trained, it keeps refining its outputs based on the prompts it receives. Its developers will also fine-tune the model for more specific uses, continuing to change the parameters of the algorithm, or using methods like low-rank adaptation (LoRA) to quickly adjust the model to a new use.
Put together, the result is a model that can respond to prompts from humans by generating text or images based on the samples it has seen.
However, human prompts can vary greatly in complexity and cause unexpected behavior by the model, since it is impossible to prepare it for every possible prompt. And, the model may misunderstand or misinterpret the relationships between concepts and items even after extensive training and fine-tuning. Unexpected prompts and misperceptions of patterns can lead to AI hallucinations.
Sources of training data: It is hard to vet training data because AI models need so much that a human cannot review all of it. Unreviewed training data may be incorrect or weighted too heavily in a certain direction. Imagine an AI model that is trained to write greeting cards, but its training data set ends up containing mostly birthday cards, unbeknownst to its developers. As a result, it might generate happy or funny messages in inappropriate contexts, such as when prompted to write a "Get well soon" card.
Inherent limits of generative AI design: AI models use probability to "predict" which words or visual elements are likely to appear together. Statistical analysis can help a computer create plausible-seeming content — content that has a high probability of being understood by humans. But statistical analysis is a mathematical process that may miss some of the nuances of language and meaning, resulting in hallucinations.
Lack of direct experience of the physical world: Today's AI programs are not able to detect whether something is "true" or "false" in an external reality. While a human could, for example, conduct experiments to determine if a scientific principle is true or false, AI currently can only train itself on preexisting content, not directly on the physical universe. It therefore struggles to tell the difference between accurate and inaccurate data, especially in its own responses.
Struggle to understand context: AI only looks at literal data and may not understand cultural or emotional context, leading to irrelevant responses and AI hallucinations. Satire, for example, may confuse AI (even humans often confuse satire with fact).
Bias: The training data used may lead to built-in bias if the data set is not broad enough. Bias can simply skew AI models towards giving certain kinds of answers, or it can even lead to promoting racial or gender stereotypes.
Attacks on the model: Malicious persons can use prompt injection attacks to alter the way generative AI models perceive prompts and produce results. A highly public example occurred in 2016, when Microsoft launched a chatbot, Tay, that within a day started generating racist and sexist content due to Twitter (now X) users feeding it information that distorted its responses. AI models have become more sophisticated since then but are still vulnerable to such attacks.
Overfitting: If an AI model is trained too much on its initial training data set, it can lose the ability to generalize, detect trends, or draw accurate conclusions from new data. It may also detect patterns in its training data that are not actually significant, leading to errors that are not apparent until it is fed new data. These scenarios are called "overfitting": the models fit too closely with their training data. As an example of overfitting, during the COVID-19 pandemic, AI models trained on scans of COVID patients in hospitals started picking up on the text font that the different hospitals used, and treating the font as a predictor of COVID diagnosis. For generative AI models, overfitting can lead to hallucinations.
While developers may not be able to eliminate AI hallucinations completely, there are concrete steps they can take to make hallucinations and other inaccuracies less likely.
Learn how Cloudflare for AI helps developers build and run AI models from anywhere in the world. And discover how Cloudflare Vectorize enables developers to generate and store embeddings in a globally distributed vector database.