# Reducing Hallucinations

A hallucination is confident, fluent, wrong output. You can't eliminate it, but you can engineer it down dramatically.

## Why Models Hallucinate

Recall Module 1: an LLM predicts plausible next tokens, not true ones. With no grounding and a question beyond its knowledge, the most "plausible-sounding" continuation is often a confident fabrication.
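
To make "plausible, not true" concrete, here is a toy sketch of how a next-token choice is scored. The candidate strings and scores are invented for illustration and do not come from a real model; `softmax` is the standard formula, and `temperature` is the usual sampling parameter that sharpens or flattens the distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical continuations for a question about a place that does not exist.
candidates = ["Poseidonia", "unknown", "I don't know"]
logits = [2.0, 0.5, 0.1]  # illustrative numbers only

probs = softmax(logits)
best = candidates[probs.index(max(probs))]
print(best, [round(p, 2) for p in probs])
```

Nothing in this scoring step checks truth. The fluent, specific-sounding candidate wins simply because it has the highest score, and lowering the temperature only concentrates more probability on it.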

## The Hierarchy of Fixes (most → least effective)

1. Ground with retrieval (RAG): give the model the facts to read instead of recall (Module 4). This is the biggest single lever.
2. Use tools for exact tasks: calculator, code execution, and DB lookups via tool calling. Don't ask the model to be a database.
3. Give an explicit "I don't know" path:

   ```text
   If the answer is not supported by the context, respond exactly:
   "I don't have enough information to answer that."
   Do not speculate.
   ```

4. Demand citations: "cite the source id for each claim." Unsupported claims become visible and verifiable.
5. Chain-of-thought / verification: ask the model to reason, then check its own answer against the context before finalising.
6. Lower temperature: for factual tasks, temperature 0 reduces creative drift.
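
Several of these fixes can live in the same prompt. The sketch below combines grounding, the "I don't know" path, and required citations. It assumes retrieved passages arrive as a dict of source id to text. `build_grounded_prompt` and `unknown_citations` are hypothetical helper names, not part of any library, and the square-bracket citation format is just one possible convention.

```python
import re

REFUSAL = "I don't have enough information to answer that."

def build_grounded_prompt(question, passages):
    """Build a prompt that restricts the model to the given passages.

    passages: dict mapping a source id (str) to passage text (str).
    """
    context = "\n".join(f"[{sid}] {text}" for sid, text in passages.items())
    return (
        "Answer using only the context below.\n"
        "Cite the source id in square brackets after each claim.\n"
        "If the answer is not supported by the context, respond exactly:\n"
        f'"{REFUSAL}"\n'
        "Do not speculate.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def unknown_citations(answer, passages):
    """Return cited ids that do not match any supplied passage."""
    cited = set(re.findall(r"\[([^\]]+)\]", answer))
    return sorted(cited - set(passages))

passages = {
    "doc1": "The warranty lasts 24 months.",
    "doc2": "Returns are accepted within 30 days.",
}
prompt = build_grounded_prompt("How long is the warranty?", passages)
bad = unknown_citations("It lasts 24 months [doc1]. Shipping is free [doc9].", passages)
print(bad)
```

A citation to an id that was never supplied is a cheap, mechanical signal that a claim may be invented. It does not prove that correctly cited claims are accurate.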

## Self-Verification Pattern

```text
Step 1: Draft an answer.
Step 2: For each factual claim, quote the exact supporting
        sentence from the context. If you cannot quote
        support, delete the claim.
Step 3: Output only the verified answer.
```
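
The quoting step can also be checked in code rather than trusted. The sketch below assumes the model has been asked to return each claim paired with its supporting quote. `keep_verified` is a hypothetical helper, and the match is a plain substring test after normalising case and whitespace, so a paraphrased quote will be rejected.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())

def keep_verified(claims, context):
    """Keep only claims whose supporting quote appears in the context.

    claims: list of (claim, quote) pairs as returned by the model.
    """
    haystack = normalize(context)
    return [
        claim
        for claim, quote in claims
        if quote and normalize(quote) in haystack
    ]

context = "The warranty lasts 24 months. Returns are accepted within 30 days."
claims = [
    ("The warranty is two years.", "The warranty lasts 24 months."),
    ("Shipping is free.", "All orders ship free of charge."),  # quote not in context
    ("Returns are always possible.", ""),                      # no quote offered
]
verified = keep_verified(claims, context)
print(verified)
```

This confirms only that the quote exists in the context. Whether the quote actually supports the claim still needs a human or a second model pass.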

## What Does Not Reliably Work

| Myth | Reality |
| --- | --- |
| "Just tell it: do not hallucinate" | Weak on its own: the model doesn't know when it's wrong |
| "Bigger model = no hallucination" | Reduces, never eliminates |
| "High confidence wording = correct" | Confidence ≠ accuracy in LLMs |

## Practical Recipe

RAG for knowledge + tools for computation + explicit "I don't know" + required citations + temperature 0 + an eval set that measures hallucination rate.

Combine them. No single trick is sufficient; the stack is.
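
The eval-set piece of the recipe can start very small. The sketch below measures one narrow proxy: how often the model answers anyway when the question is labelled as unanswerable from the supplied context. The record fields `answerable` and `response` and the function `hallucination_rate` are assumptions made for illustration, not a standard metric definition.

```python
REFUSAL = "I don't have enough information to answer that."

def hallucination_rate(results):
    """Fraction of unanswerable questions where the model did not refuse.

    results: list of dicts with keys
      'answerable' (bool): human label, is the answer in the context?
      'response'   (str):  what the model returned.
    """
    unanswerable = [r for r in results if not r["answerable"]]
    if not unanswerable:
        return 0.0  # nothing to measure on this eval set
    fabricated = sum(1 for r in unanswerable if r["response"].strip() != REFUSAL)
    return fabricated / len(unanswerable)

results = [
    {"answerable": True, "response": "24 months [doc1]."},
    {"answerable": False, "response": REFUSAL},
    {"answerable": False, "response": "Shipping takes 3 days."},  # answered anyway
]
rate = hallucination_rate(results)
print(rate)
```

Wrong answers to answerable questions are not caught by this proxy. Those need labelled reference answers or human review, tracked on the same eval set so the rate can be compared across prompt and model changes.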

Mindset: Don't ask "how do I make it never lie?" Ask "how do I make wrong answers rare, visible, and verifiable?"
