Tuesday, 5 May 2026

What are real examples of a math hallucination and how to catch it?

Here are 3 real patterns of math hallucinations that keep showing up, plus exactly how to catch them:

*1. The fake citation / invented theorem*

*What the AI said*:  
> "By the _Chen-Zhang Theorem (2019)_, any integer > 2 can be written as the sum of a prime and a power of 2. Therefore 7 = 3 + 2²."

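*Why it’s a hallucination*: There is no Chen-Zhang Theorem from 2019. The claim is also false as stated: 16 cannot be written as a prime plus a power of 2, because $16 - 1$, $16 - 2$, $16 - 4$, and $16 - 8$ are 15, 14, 12, and 8, none of which is prime. It resembles de Polignac’s conjecture (every odd number greater than 1 is a prime plus a power of 2), which was disproved; 127 is a counterexample. The AI invented both the theorem and the citation to sound authoritative.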

*How to catch it*: 
- *Google the exact theorem name + year*: If the only hits are AI-generated text, or there are none at all, treat it as fake.
- *Check arXiv/Google Scholar*: Real theorems have papers. Search `arXiv Chen Zhang 2019 prime power 2`.
- *Ask for a link*: “Give me the DOI or arXiv ID.” A model that hallucinated the source usually replies with a broken link or with “I made an error.”
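
When the invented theorem makes a concrete claim about numbers, a brute-force search is another quick test. The sketch below is a minimal check, not a proof technique: the function names `is_prime` and `counterexamples` are made up for this example, and it assumes $2^0 = 1$ counts as a power of 2.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for the small numbers checked here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def counterexamples(limit: int) -> list[int]:
    """Integers n with 2 < n <= limit that are NOT a prime plus a power of 2."""
    found = []
    for n in range(3, limit + 1):
        power = 1  # assumption: 2**0 = 1 counts as a power of 2
        ok = False
        while power < n:
            if is_prime(n - power):
                ok = True
                break
            power *= 2
        if not ok:
            found.append(n)
    return found


bad = counterexamples(200)
print(bad[0])      # smallest integer > 2 that breaks the claim: 16
print(127 in bad)  # True: an odd number that also breaks it
```

A single counterexample is enough to show the quoted “theorem” cannot be real, whatever name is attached to it.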

*2. The confident wrong arithmetic*

*What the AI said*:
> "To integrate, $\int_0^1 x^2 dx = \left[\frac{x^3}{3}\right]_0^1 = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3}$. Now multiply by 9: $9 \times \frac{1}{3} = 2.7$."

*Why it’s a hallucination*: $9 \times \frac{1}{3} = 3$, not 2.7. The AI did the calculus right but slipped on grade-school arithmetic, and the correct steps before the slip make the wrong result look trustworthy.

*How to catch it*:
1. *Sanity check simple ops*: Estimate with rounded numbers. $9 \times 0.33\ldots$ should be about 3, not 2.7.
2. *Re-run in a calculator*: Paste `9 * (1/3)` into your phone. 
3. *Force code output*: Prompt “Solve this and show Python code to verify.” If the code gives 3.0, the text was wrong.
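
The “force code output” check can be as short as the sketch below. It uses only Python’s standard library; the helper name `midpoint_integral` and the step count are choices made for this example, not something the AI produced.

```python
from fractions import Fraction
import math


def midpoint_integral(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


# Exact arithmetic: the antiderivative x^3/3 evaluated from 0 to 1.
integral = Fraction(1, 3) - Fraction(0, 3)
product = 9 * integral

# Independent numeric cross-check of the calculus step.
approx = midpoint_integral(lambda x: x * x, 0.0, 1.0)

print(integral)                                   # 1/3
print(product)                                    # 3
print(math.isclose(approx, 1 / 3, rel_tol=1e-6))  # True: the calculus was right
print(product == Fraction(27, 10))                # False: 2.7 was wrong
```

Using `Fraction` keeps the arithmetic exact, so the comparison against 2.7 does not depend on floating-point rounding.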

*3. The broken proof with hidden divide-by-zero*

*What the AI said*:
> "Let $a = b$.  
> Then $a^2 = ab$  
> $a^2 - b^2 = ab - b^2$  
> $(a-b)(a+b) = b(a-b)$  
> Divide both sides by $(a-b)$: $a + b = b$  
> Since $a = b$, then $2b = b$, so $2 = 1$."

*Why it’s a hallucination*: The “divide both sides” step divides by $(a-b)$, but since $a = b$, that is $(b-b) = 0$, and division by zero is undefined. The AI presented the step as valid algebra.

*How to catch it*:
1. *Check every division*: Ask “What are you dividing by and can it be zero?” 
2. *Plug in numbers*: Let $a=5, b=5$. The step $(a-b)(a+b) = b(a-b)$ becomes $0 \times 10 = 5 \times 0$ which is true. But dividing gives $10 = 5$, false.
3. *Use a proof assistant*: A tool like Lean will not accept that step, because cancelling $(a-b)$ requires a proof that $a - b \neq 0$, and none exists when $a = b$.
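
The “plug in numbers” check generalizes: evaluate both sides of every line at a concrete value and see where equality first breaks. A minimal sketch, with the list `steps` and the function `first_broken_step` invented for this example:

```python
# Each entry: (description, left-hand side, right-hand side) as functions of a, b.
steps = [
    ("a = b",                lambda a, b: a,                 lambda a, b: b),
    ("a^2 = ab",             lambda a, b: a * a,             lambda a, b: a * b),
    ("a^2 - b^2 = ab - b^2", lambda a, b: a * a - b * b,     lambda a, b: a * b - b * b),
    ("(a-b)(a+b) = b(a-b)",  lambda a, b: (a - b) * (a + b), lambda a, b: b * (a - b)),
    ("a + b = b",            lambda a, b: a + b,             lambda a, b: b),
    ("2b = b",               lambda a, b: 2 * b,             lambda a, b: b),
]


def first_broken_step(a, b):
    """Return (1-based line number, description) of the first false line, or None."""
    for number, (text, lhs, rhs) in enumerate(steps, start=1):
        if lhs(a, b) != rhs(a, b):
            return number, text
    return None


print(first_broken_step(5, 5))  # (5, 'a + b = b')
```

With $a = b = 5$ the first false line is the one produced by the division. Note that $a = b = 0$ satisfies every line, so test with a nonzero value.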

*General “catch it” checklist*

| Red flag | Quick test |
| --- | --- |
| **Named theorem you’ve never heard of** | Google `"Theorem name" math` with quotes |
| **arXiv ID or DOI given** | Paste it into http://arxiv.org/abs/ID or http://doi.org/ID |
| **Arithmetic in the middle of a proof** | Recalculate that line by hand or in code |
| **“It follows trivially that...”** | That’s where steps get skipped. Ask “show the trivial step” |
| **Too-perfect answer** | If it claims to solve an open problem like the Riemann Hypothesis, it is almost certainly hallucinated |

*Best defense*: Treat AI like a student showing work. Don’t grade the final answer; grade each line. For anything that matters, run it through Wolfram Alpha, SymPy, or just a calculator.

