*Designing a RAG system that stays factual + handles conflicts*
RAG reduces hallucinations by grounding the LLM in retrieved docs. But garbage in = garbage out. Here's how to architect it for accuracy:
*1. Retrieval: Get the right stuff, not just stuff*
*Better chunking & indexing*
- *Semantic chunking*: Split by meaning, not arbitrary token counts. Keep headers, tables, and list items together.
- *Hybrid retrieval*: Combine `vector search` for meaning + `BM25/keyword` for exact terms. Vector-only can't reliably separate "iPhone 15 Pro Max" from "iPhone 15 Pro"; keyword matching catches the exact variant (see the sketch after this list).
- *Metadata filtering*: Tag chunks with date, source authority, doc type. Then filter: `date > 2024` or `source = official_docs`.
- *Query rewriting*: Expand the user query before retrieval. "RAG hallucination fixes" → also search "RAG factuality methods" + "RAG conflict resolution".
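A minimal sketch of hybrid retrieval merged with reciprocal rank fusion (RRF). Assumptions: `rank-bm25` is installed, `embed()` is a placeholder for whatever embedding model you use, and 60 is the conventional RRF constant, not a tuned value.

```python
import numpy as np
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def embed(text: str) -> np.ndarray:
    """Placeholder: plug in your embedding model here."""
    raise NotImplementedError

def hybrid_search(query: str, chunks: list[str], k: int = 5) -> list[str]:
    # Lexical ranking: exact terms like "iPhone 15 Pro Max" score here.
    bm25 = BM25Okapi([c.lower().split() for c in chunks])
    bm25_rank = np.argsort(-bm25.get_scores(query.lower().split()))

    # Semantic ranking: cosine similarity over chunk embeddings.
    q = embed(query)
    doc_vecs = np.array([embed(c) for c in chunks])
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    vec_rank = np.argsort(-sims)

    # Reciprocal rank fusion: merges the two rankings without having
    # to normalize their incompatible score scales.
    rrf = np.zeros(len(chunks))
    for ranking in (bm25_rank, vec_rank):
        for pos, idx in enumerate(ranking):
            rrf[idx] += 1.0 / (60 + pos)
    return [chunks[i] for i in np.argsort(-rrf)[:k]]
```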
*Source quality control*
- *Whitelist domains*: Only index trusted sources. Wikipedia + ArXiv > random blog.
- *Recency weighting*: Boost newer docs for time-sensitive topics; demote or expire stale chunks automatically (sketch after this list).
- *Deduplication*: Cluster near-identical chunks so one bad source doesn't dominate.
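One way to implement recency weighting is exponential decay on the retrieval score. The 180-day half-life here is an assumption to tune per domain.

```python
from datetime import datetime, timezone

def recency_weighted(score: float, doc_date: datetime,
                     half_life_days: float = 180.0) -> float:
    # Halve a chunk's score every `half_life_days` of age.
    # `doc_date` must be timezone-aware.
    age_days = (datetime.now(timezone.utc) - doc_date).days
    return score * 0.5 ** (age_days / half_life_days)
```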
*2. Generation: Force the LLM to stay grounded*
*Constrained prompting*
- *"Answer only from context"*: System prompt: `If the context doesn't contain the answer, say "I don't know". Don't use outside knowledge.`
- *Cite inline*: `Answer with [Source 1] after each claim.` Forces the model to tie statements to chunks. If it can't cite, it shouldn't say it.
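Putting both rules together, a minimal prompt-building sketch. The exact wording and the `[Source N]` convention are assumptions, not a fixed recipe:

```python
SYSTEM_PROMPT = """Answer using ONLY the numbered sources below.
Cite [Source N] after every claim it supports.
If the sources do not contain the answer, reply exactly: I don't know."""

def build_prompt(question: str, chunks: list[str]) -> str:
    # Number the chunks so "[Source N]" citations are unambiguous.
    context = "\n\n".join(
        f"[Source {i}] {c}" for i, c in enumerate(chunks, start=1)
    )
    return f"{SYSTEM_PROMPT}\n\n{context}\n\nQuestion: {question}"
```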
*Self-verification loops*
1. Generate answer with citations
2. Run a second pass: `For each sentence, check if [Source X] actually supports it. Remove if unsupported.`
3. Or use NLI models to flag claims not entailed by retrieved text.
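A sketch of the NLI pass using `sentence-transformers`; the cross-encoder model named here is one public option, not a requirement. Each answer sentence is the hypothesis, its cited chunk the premise; anything whose top label isn't entailment gets dropped.

```python
from sentence_transformers import CrossEncoder

LABELS = ["contradiction", "entailment", "neutral"]  # this model's label order
nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")

def drop_unsupported(sentences: list[str], cited_chunks: list[str]) -> list[str]:
    # (premise, hypothesis) pairs: does the cited chunk entail the sentence?
    preds = nli.predict(list(zip(cited_chunks, sentences))).argmax(axis=1)
    return [s for s, p in zip(sentences, preds) if LABELS[p] == "entailment"]
```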
*Structured output*
- Ask for JSON: `{"answer": "...", "sources": [1,3], "confidence": "high/med/low"}`
- If `sources: []` or `confidence: "low"`, return "Not enough info in retrieved docs".
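A sketch of that gating logic, assuming the JSON shape above comes back as a raw string:

```python
import json

FALLBACK = "Not enough info in retrieved docs."

def gate(raw_output: str) -> str:
    # Unparseable, uncited, or low-confidence answers never reach the user.
    try:
        out = json.loads(raw_output)
    except json.JSONDecodeError:
        return FALLBACK
    if not out.get("sources") or out.get("confidence") == "low":
        return FALLBACK
    return out["answer"]
```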
*3. Handling conflicting information*
This is where most RAG systems fall apart. Don't hide conflicts — expose them.
*Conflict detection at retrieval*
- Embed the chunks + run clustering. If the top-K results form 2+ distinct clusters, flag a conflict (sketch after this list).
- Or use an LLM: `Do these sources agree? Answer: Yes/No + explain.`
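A rough sketch of the clustering check with scikit-learn. The cosine-distance threshold is an assumption to tune, and distinct clusters can also just mean distinct subtopics, so treat this as a trigger for the LLM check, not a verdict.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def looks_conflicting(top_k_embeddings: np.ndarray,
                      distance_threshold: float = 0.35) -> bool:
    # Cluster top-K chunk embeddings by cosine distance; 2+ clusters
    # means the retrieved results split into groups worth checking.
    if len(top_k_embeddings) < 2:
        return False
    clusters = AgglomerativeClustering(
        n_clusters=None,
        distance_threshold=distance_threshold,
        metric="cosine",
        linkage="average",
    ).fit(top_k_embeddings)
    return clusters.n_clusters_ >= 2
```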
*Conflict resolution strategies*
| Strategy | When to use | Example output |
| --- | --- | --- |
| **Present both** | No clear authority | `Source A says the release is Q2 2026. Source B says Q3 2026.` |
| **Prefer authority** | You have trust scores | `According to the official spec [1], it's 4GB. Blog [2] claims 8GB.` |
| **Temporal override** | Dated facts conflict | `As of Jan 2026 [1], the limit is 100. Older doc [2] says 50.` |
| **Synthesize with caveat** | Minor differences | `Estimates range 10-12M [1][2]. Most recent data suggests 11.3M [1].` |
*Key implementation detail*: Pass source metadata into the prompt. `Source 1: CDC.gov, 2026-03-15 | Source 2: healthblog.com, 2023`. The LLM can then reason about trust + recency.
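A sketch of that detail; the metadata fields (`domain`, `date`, `text`) are assumptions about your chunk schema.

```python
def format_sources(chunks: list[dict]) -> str:
    # Prefix each chunk with its provenance so the model can weigh
    # trust and recency, e.g. "Source 1: CDC.gov, 2026-03-15".
    return "\n\n".join(
        f"Source {i}: {c['domain']}, {c['date']}\n{c['text']}"
        for i, c in enumerate(chunks, start=1)
    )
```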
*4. System-level safeguards*
1. *Eval pipeline*: Build a test set of questions with known answers. Measure `citation precision` = % of cited claims actually supported (sketch after this list). Track hallucination rate over time.
2. *Fallback hierarchy*: `High-confidence RAG answer` → `Low-confidence + "unverified"` → `I don't know`. Never guess.
3. *Human feedback loop*: Log cases where users downvote answers. Feed them back to tune the reranker or flag bad chunks.
4. *Calibrated refusal*: If top-K similarity scores < threshold, don't generate. Just say "No relevant docs found".
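A minimal sketch of the citation-precision metric. `is_supported` could be the NLI check from section 2 or a human label, and the claim dict shape is an assumption.

```python
from typing import Callable

def citation_precision(claims: list[dict],
                       is_supported: Callable[[dict], bool]) -> float:
    # Of all claims that cite a source, what fraction are actually
    # supported by the cited chunk?
    cited = [c for c in claims if c.get("source_ids")]
    return sum(is_supported(c) for c in cited) / len(cited) if cited else 0.0
```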
*The brutal truth*
You can't get to 0% hallucinations with current LLMs. The goal is `auditable answers`. Every claim should trace to a chunk. If you can't point to where the model got it, don't show it to the user.
Best combo I've seen: `Hybrid retrieval + reranker model` → `LLM with citation-forcing prompt` → `NLI verification pass` → `Conflict-aware synthesis`.