*1. The fake citation / invented theorem*
*What the AI said*:
> "By the _Chen-Zhang Theorem (2019)_, any integer > 2 can be written as the sum of a prime and a power of 2. Therefore 7 = 3 + 2²."
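*Why it’s a hallucination*: There is no Chen-Zhang Theorem from 2019. The statement it’s “proving” loosely resembles Goldbach-style conjectures, but it is simply false: 16 is a counterexample, because $16 - 1 = 15$, $16 - 2 = 14$, $16 - 4 = 12$ and $16 - 8 = 8$ are all composite. The single example $7 = 3 + 2^2$ is true, but one example proves nothing about “any integer.” The AI invented both the theorem and the citation to sound authoritative.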
*How to catch it*:
- *Google the exact theorem name + year*: If nothing shows up beyond the AI’s own answer, treat it as fake until you find a real source.
- *Check arXiv/Google Scholar*: Real theorems have papers. Search `arXiv Chen Zhang 2019 prime power 2`.
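- *Ask for a link*: “Give me the DOI or arXiv ID.” A model that hallucinated the citation will usually reply with a broken link or an “I made an error.” If it does give a link, open it and confirm it points to the paper it claims.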
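When the claimed theorem is about small integers, a brute-force search is another cheap test. The sketch below is only an illustration: the function names are made up for this post, and it assumes “power of 2” includes $2^0 = 1$. It lists every integer in a range that cannot be written as a prime plus a power of 2.

```python
from typing import List


def is_prime(n: int) -> bool:
    """Trial division; slow but fine for the small ranges used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def is_prime_plus_power_of_two(n: int) -> bool:
    """True if n = p + 2**k for some prime p and integer k >= 0 (assumption: 2**0 counts)."""
    power = 1
    while power < n:
        if is_prime(n - power):
            return True
        power *= 2
    return False


def counterexamples(limit: int) -> List[int]:
    """Integers n with 2 < n <= limit that are NOT a prime plus a power of 2."""
    return [n for n in range(3, limit + 1) if not is_prime_plus_power_of_two(n)]


if __name__ == "__main__":
    print(counterexamples(30))
```

If the printed list is non-empty, the “theorem” is false as stated, whatever name and year are attached to it. An empty list would not prove the claim; it would only mean no counterexample exists in the range searched.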
*2. The confident wrong arithmetic*
*What the AI said*:
> "To integrate, $\int_0^1 x^2 dx = \left[\frac{x^3}{3}\right]_0^1 = \frac{1^3}{3} - \frac{0^3}{3} = \frac{1}{3}$. Now multiply by 9: $9 \times \frac{1}{3} = 2.7$."
*Why it’s a hallucination*: $9 \times \frac{1}{3} = 3$, not 2.7. The AI did the calculus right but slipped on grade-school arithmetic, and it stated the wrong product with the same confidence as the correct steps before it.
*How to catch it*:
1. *Sanity check simple ops*: Round the numbers. $9 \times 0.33\ldots$ should be about 3, not 2.7.
2. *Re-run in a calculator*: Paste `9 * (1/3)` into your phone.
3. *Force code output*: Prompt “Solve this and show Python code to verify.” If the code gives 3.0, the text was wrong.
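Here is roughly what that verification code could look like. It is a minimal sketch using Python’s standard `fractions` module so the result is exact rather than a rounded float; the helper name `integral_x_squared` is invented for this example.

```python
from fractions import Fraction


def integral_x_squared(lower: Fraction, upper: Fraction) -> Fraction:
    """Exact integral of x**2 over [lower, upper], using the antiderivative x**3 / 3."""
    return (upper**3 - lower**3) / 3


# Reproduce the AI's two steps with exact rational arithmetic.
integral = integral_x_squared(Fraction(0), Fraction(1))
result = 9 * integral

print("integral:", integral)
print("9 * integral:", result)
```

The integral matches the AI’s $\frac{1}{3}$, but the product does not match its 2.7, which pins the error on the multiplication line and not on the calculus.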
*3. The broken proof with hidden divide-by-zero*
*What the AI said*:
> "Let $a = b$.
> Then $a^2 = ab$
> $a^2 - b^2 = ab - b^2$
> $(a-b)(a+b) = b(a-b)$
> Divide both sides by $(a-b)$: $a + b = b$
> Since $a = b$, then $2b = b$, so $2 = 1$."
*Why it’s a hallucination*: The division step (the fifth line) divides by $(a-b)$, but since $a = b$, that divisor is $(b-b) = 0$. You can’t divide by zero, so everything after that line is meaningless. The AI presented it as valid algebra.
*How to catch it*:
1. *Check every division*: Ask “What are you dividing by and can it be zero?”
2. *Plug in numbers*: Let $a=5, b=5$. The step $(a-b)(a+b) = b(a-b)$ becomes $0 \times 10 = 5 \times 0$ which is true. But dividing gives $10 = 5$, false.
3. *Use a proof assistant*: In a tool like Lean, cancelling a common factor requires a proof that the factor is nonzero, so that step won’t pass when $a = b$.
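The “plug in numbers” check is easy to automate. The sketch below is an illustration only (the function `audit_proof` and its report format are made up for this post). It evaluates each line of the argument at concrete values and flags the division when the divisor is zero.

```python
from typing import List


def audit_proof(a: int, b: int) -> List[str]:
    """Evaluate each line of the '2 = 1' argument at concrete values of a and b."""
    report = []

    # Lines that are plain equalities: compare left and right sides directly.
    equalities = [
        ("a^2 = ab", a * a, a * b),
        ("a^2 - b^2 = ab - b^2", a * a - b * b, a * b - b * b),
        ("(a-b)(a+b) = b(a-b)", (a - b) * (a + b), b * (a - b)),
    ]
    for label, lhs, rhs in equalities:
        report.append(f"{label}: {'holds' if lhs == rhs else 'FAILS'}")

    # The division step is only legal when the divisor is nonzero.
    if a - b == 0:
        report.append("divide by (a-b): INVALID, divisor is 0")
    else:
        report.append("divide by (a-b): allowed")

    # The conclusion of the division step.
    report.append(f"a + b = b: {'holds' if a + b == b else 'FAILS'}")
    return report


for line in audit_proof(5, 5):
    print(line)
```

With $a = b = 5$, every equality before the division holds, the division is flagged, and the line after it fails. That pattern (true, true, true, then false) points straight at the step where the proof breaks.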
*General “catch it” checklist*
| Red flag | Quick test |
| --- | --- |
| **Named theorem you’ve never heard of** | Google `"Theorem name" math` with quotes |
| **arXiv ID or DOI given** | Paste it into http://arxiv.org/abs/ID or http://doi.org/ID |
| **Arithmetic in the middle of a proof** | Recalculate that line by hand/code |
| **“It follows trivially that...”** | That’s where steps get skipped. Ask “show the trivial step” |
| **Too-perfect answer** | If it solves an open problem like the Riemann Hypothesis, it’s 99.999% hallucinated |
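Before pasting an arXiv ID into the URL, it can help to check that it even has the right shape. The sketch below is a rough filter only: the function name is invented for this post, it handles just the current `YYMM.NNNNN` style (four or five digits after the dot, optional version suffix), and it does not validate the month or cover older `archive/NNNNNNN` IDs. A well-formed ID can still point to nothing, or to an unrelated paper, so opening the link is still required.

```python
import re
from typing import Optional

# New-style arXiv IDs: YYMM.NNNN or YYMM.NNNNN, optionally followed by vN.
ARXIV_NEW_STYLE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def arxiv_url(arxiv_id: str) -> Optional[str]:
    """Return the abstract-page URL for a well-formed new-style ID, else None."""
    candidate = arxiv_id.strip()
    if ARXIV_NEW_STYLE.match(candidate) is None:
        return None
    return f"http://arxiv.org/abs/{candidate}"


print(arxiv_url("Chen-Zhang 2019"))  # not shaped like an ID at all
print(arxiv_url("2301.00001v2"))     # well-formed; still needs to be opened and read
```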
*Best defense*: Treat AI like a student showing work. Don’t grade the final answer — grade each line. For anything that matters, run it through Wolfram Alpha, SymPy, or just a calculator.