Hallucinations

Hallucinations occur when a language model generates content that is confidently wrong, fabricated, or misleading, despite sounding plausible. These outputs are not grounded in factual data or provided context.

Why It Happens

LLMs are trained to predict the next token from statistical patterns in their training data, not to verify truth; a short sampling sketch after the list below makes this concrete. They can confidently generate:

  • Nonexistent facts
  • Incorrect citations
  • Fabricated functions, APIs, or sources
  • Incoherent logic masked by fluent language
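
A toy Python sketch makes the point concrete. The prefix, candidate continuations, and probabilities below are invented for illustration and do not come from any real model or decoding implementation; the sketch only shows that generation samples by plausibility, with no truth check anywhere in the loop.

  import random

  # Invented probabilities for a few candidate continuations of one prefix;
  # not taken from any real model.
  continuation_probs = {
      "was completed in 1858": 0.5,   # plausible and true
      "was completed in 1901": 0.3,   # equally fluent, but false
      "never reached America": 0.2,   # fluent, false, and dramatic
  }

  def sample_continuation(probs, temperature=1.0):
      """Pick a continuation in proportion to its (temperature-adjusted)
      probability. Nothing here consults a knowledge source or verifies the claim."""
      weights = [p ** (1.0 / temperature) for p in probs.values()]
      return random.choices(list(probs.keys()), weights=weights, k=1)[0]

  prefix = "The first transatlantic telegraph cable"
  print(prefix, sample_continuation(continuation_probs))

A fluent but false continuation is just as reachable as a true one, which is why confident wording is not evidence of accuracy.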

Types of Hallucinations

  • Factual: Asserting incorrect or unverifiable information
  • Contextual: Ignoring or misinterpreting provided input
  • Structural: Generating formats or schemas that don’t exist (e.g., fake JSON keys); a validation sketch follows this list
  • Confabulation: Filling in gaps with invented detail to maintain coherence
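
Structural hallucinations are often the easiest to catch mechanically. The sketch below compares a model's JSON output against the keys a schema actually defines; the expected keys, the sample output, and the helper name check_structure are hypothetical placeholders, not part of any particular API.

  import json

  # Hypothetical schema and model output, invented for illustration; substitute
  # your own expected keys and your real model call.
  EXPECTED_KEYS = {"name", "email", "signup_date"}
  model_output = '{"name": "Ada", "email": "ada@example.com", "member_since": "2021"}'

  def check_structure(raw, expected_keys):
      """Flag keys the schema never defined and keys the model left out."""
      try:
          data = json.loads(raw)
      except json.JSONDecodeError:
          return ["output is not valid JSON"]
      found = set(data)  # keys the model actually produced
      issues = [f"unexpected key: {k}" for k in found - expected_keys]
      issues += [f"missing key: {k}" for k in expected_keys - found]
      return issues

  print(check_structure(model_output, EXPECTED_KEYS))
  # ['unexpected key: member_since', 'missing key: signup_date']

A stricter schema validator does the same job more thoroughly, but even a key-level check catches fabricated fields before they reach downstream code.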

Triggers

  • Vague prompts
  • Overly open-ended tasks
  • Ambiguous or contradictory input
  • Long generations, where small errors compound over many tokens
  • Lack of grounding (e.g., no RAG or retrieval); see the prompt contrast after this list
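
To make the first and last triggers concrete, here is a contrast between a hallucination-prone prompt and a more constrained, grounded one. The prompts and the retrieved policy text are invented placeholders, not output from a real retrieval system.

  # Invented prompts and policy text, for illustration only.

  # Hallucination-prone: vague, open-ended, and ungrounded.
  vague_prompt = "Tell me about our refund policy."

  # Lower-risk: narrow question, grounded in retrieved text, with an explicit way out.
  retrieved_policy = "Refunds are available within 30 days of purchase with a receipt."
  grounded_prompt = (
      "Answer using ONLY the policy text below. "
      "If the answer is not in the text, say you don't know.\n\n"
      f"Policy: {retrieved_policy}\n\n"
      "Question: How long do customers have to request a refund?"
  )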

Mitigation Strategies

  • Use retrieval-augmented generation (RAG) to ground responses in real data
  • Fine-tune or supervise with factual datasets
  • Ask models to cite sources or explain reasoning
  • Use confidence thresholds and human-in-the-loop review (sketched after this list)
  • Apply prompt constraints to guide behavior
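
As one concrete mitigation, the sketch below combines a confidence threshold with human-in-the-loop routing. It assumes the model API exposes per-token log probabilities and uses their mean as a rough confidence proxy; the threshold value, function name, and example numbers are all assumptions to adapt to your own stack.

  import math

  # Threshold on mean token probability; tune empirically on labeled examples.
  CONFIDENCE_THRESHOLD = 0.80

  def route_answer(answer_text, token_logprobs):
      """Auto-approve only when mean token probability clears the threshold;
      otherwise hold the answer for a human reviewer."""
      mean_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
      status = "auto-approved" if mean_prob >= CONFIDENCE_THRESHOLD else "needs-human-review"
      return {"answer": answer_text, "status": status, "confidence": mean_prob}

  # Hypothetical values for illustration only.
  print(route_answer("The API was released in 2015.", [-0.02, -0.31, -0.45, -0.12]))

Mean token probability is only a crude proxy for factual accuracy; the point of the sketch is the routing pattern, with low-confidence answers escalated to a reviewer rather than returned to the user.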

Real-World Risks

  • Misinformation in enterprise tools
  • False legal or medical claims
  • Fabricated academic citations
  • Code that compiles but is semantically wrong