RAG vs Wiki

A comparison of traditional Retrieval-Augmented Generation (semantic search RAG), agentic search, and the LLM Wiki Pattern for knowledge management.

The Key Variable: Data Structure

The “RAG is dead” narrative is misleading. The real question is: how structured is your data?

| Data Type | Best Retrieval | Why |
| --- | --- | --- |
| Code (structured) | Agentic search (ripgrep, glob, file navigation) | Exact identifiers, file paths, terminal tools |
| Personal knowledge (semi-structured) | LLM Wiki (index files, wikilinks) | Pre-compiled synthesis, explicit cross-references |
| Large unstructured knowledge bases | Semantic search RAG (vector DB) | Synonym matching, conceptual similarity, scale |
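
Read as a decision rule, the table collapses to a few lines of routing logic. A minimal sketch; the type labels and return strings here are illustrative, not from any tool:

```python
# Illustrative router over the table above; labels and strings are assumptions.
def pick_retrieval(data_type: str) -> str:
    """Map a data type to the retrieval strategy the table recommends."""
    routes = {
        "code": "agentic search (ripgrep, glob, file navigation)",
        "personal_kb": "LLM wiki (index files, wikilinks)",
        "unstructured_corpus": "semantic search RAG (vector DB)",
    }
    return routes.get(data_type, "unknown: profile the data first")

print(pick_retrieval("code"))  # agentic search (ripgrep, glob, file navigation)
```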

LLM Wiki vs Semantic Search RAG

| Aspect | LLM Wiki | Semantic Search RAG |
| --- | --- | --- |
| Search | Reads index files, follows wikilinks | Embedding similarity search |
| Infrastructure | Markdown files only | Embedding model + vector DB + chunking pipeline |
| Cost | Token usage only | Ongoing compute + storage |
| Maintenance | Lint passes + source additions | Re-embedding when data changes |
| Scale ceiling | Hundreds of pages | Millions of documents |
| Relationship depth | Deep: explicit links and cross-references | Shallow: chunk-level similarity |
| Knowledge persistence | Compiled once, updated incrementally | Re-derived on every query |
| Setup time | ~5 minutes | Hours to days |
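
The "Search" row is worth making concrete: wiki-side retrieval is just reading files and chasing links. A sketch assuming Obsidian-style [[wikilinks]] and a wiki/ directory of markdown pages (both assumptions, not a fixed spec):

```python
import pathlib
import re

# Matches [[Page Name]] or [[Page Name|alias]], capturing the link target.
WIKILINK = re.compile(r"\[\[([^\]|]+)")

def follow_links(page: str, root: str = "wiki") -> list[str]:
    """Return the pages a wiki page links to, by scanning its markdown."""
    text = (pathlib.Path(root) / f"{page}.md").read_text(encoding="utf-8")
    return WIKILINK.findall(text)

# Retrieval here is just: read index.md, then read the pages it links to.
print(follow_links("index"))
```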

Why RAG Died for Code

Cole Medin and Claude Code maintainer Boris Cherny explain: Claude Code originally used RAG with a local vector DB, then abandoned it because agentic search works better for code. Three reasons (a minimal sketch follows the list):

  1. Exact identifiers — code is perfectly spelled; keyword/regex search finds what you need
  2. Built-in organization — file structure provides natural navigation
  3. Terminal tools — ripgrep, glob, cat are powerful and fast for structured data
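
What that looks like in practice: no embeddings, just ripgrep run as a subprocess. A sketch that assumes rg is on PATH; the helper name is made up:

```python
import subprocess

def grep_code(pattern: str, root: str = ".") -> list[str]:
    """Find exact identifiers with ripgrep; returns file:line:match rows."""
    result = subprocess.run(
        ["rg", "--line-number", pattern, root],
        capture_output=True, text=True,
    )
    return result.stdout.splitlines()

# Exact identifiers make this reliable: code is perfectly spelled.
for hit in grep_code("parse_config"):
    print(hit)
```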

Aider takes a different approach: tree-sitter generates a structural index of the codebase (files, classes, functions) as a high-level overview for the agent. Not vector embeddings — just a map.
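
This is not Aider's actual tree-sitter pipeline, but the same shape can be sketched with Python's stdlib ast module: a structural map of files to top-level definitions, no embeddings anywhere.

```python
import ast
import pathlib

def repo_map(root: str) -> dict[str, list[str]]:
    """Build a {file: [top-level classes/functions]} overview of a codebase."""
    overview = {}
    for path in pathlib.Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that don't parse
        overview[str(path)] = [
            node.name for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        ]
    return overview

for file, names in repo_map("src").items():
    print(file, "->", names)
```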

Nick Pash (Cline creator): “The RAG narrative is a mind virus” for coding agents, because it causes overengineering by applying semantic search where it’s unnecessary.

Why RAG Lives for Unstructured Data

The Star Wars test: Searching “Star Wars spaceships” via keyword won’t find paragraphs about X-wings or TIE fighters. Embedding models capture that semantic similarity. This is what vector DBs excel at — and what agentic search cannot do.
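
The test is easy to reproduce. A sketch assuming the sentence-transformers package is installed; the model choice and passages are illustrative:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
query = "Star Wars spaceships"
passages = [
    "The X-wing and TIE fighter dueled above the Death Star.",
    "Quarterly revenue grew 12% year over year.",
]

# Cosine similarity between the query embedding and each passage embedding.
scores = util.cos_sim(model.encode(query), model.encode(passages))[0]
for text, score in zip(passages, scores):
    print(f"{float(score):.2f}  {text}")
# The X-wing passage scores far higher despite sharing no keywords with
# the query, which is exactly what keyword/regex search would miss.
```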

For large knowledge bases (customer support, compliance, legal, enterprise docs):

  • ~100x cheaper than agentic search at scale (small chunks vs reading whole files; arithmetic sketched after this list)
  • Semantic matching handles ambiguous natural language, synonyms, conceptual similarity
  • Keyword/regex search cannot find buried conceptual references
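
The ~100x figure is back-of-envelope, not a benchmark. One way the arithmetic can work out, with every number an assumption:

```python
# Illustrative assumptions, not measurements.
agentic_tokens = 50 * 4_000  # agent reads ~50 whole files at ~4k tokens each
rag_tokens = 5 * 400         # RAG injects the top-5 chunks at ~400 tokens each

print(f"{agentic_tokens / rag_tokens:.0f}x fewer tokens per query")  # 100x
```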

CAG: The Third Approach

Context Augmented Generation (per AI Jason) is the third entry in this comparison: instead of retrieving chunks (RAG) or following structured links (wiki), load the entire dataset into the model’s context window and let the model do the relevance work itself. This is only viable now because long-context models (Gemini 2.0 with 1M+ tokens) and collapsing per-token prices ($0.01/M input on Gemini 2.0 Flash) have inverted the cost calculus.
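
Mechanically, CAG is the simplest of the three. A sketch under stated assumptions: the corpus fits in context, and call_llm is a hypothetical stand-in for any long-context chat API:

```python
import pathlib

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a long-context chat completion call."""
    raise NotImplementedError("wire this to your provider's client")

def cag_answer(question: str, corpus_dir: str) -> str:
    """CAG in one move: ship the whole corpus plus the question in one call."""
    docs = [p.read_text(encoding="utf-8")
            for p in sorted(pathlib.Path(corpus_dir).glob("*.md"))]
    prompt = (
        "Answer using only the documents below.\n\n"
        + "\n\n---\n\n".join(docs)
        + f"\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```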

| Aspect | LLM Wiki | RAG | CAG |
| --- | --- | --- | --- |
| Best for | Personal KB | Massive unstructured corpora | Bounded datasets that fit context |
| Infrastructure | Markdown only | Embedding model + vector DB | Long-context model + cache |
| Per-query cost | Tokens for index + page | Embedding + vector lookup + LLM | One LLM call with full corpus |
| Tuning surface | Page conventions, lint passes | Chunk size, embedding, reranker | Prompt + corpus selection |
| Failure mode | Stale index | Bad retrieval | Context overflow |

The macro view: RAG was a bridge technology that papered over context-window scarcity. With cheap long context, RAG’s reason-to-exist shrinks to “the dataset is larger than any feasible context window.” See CAG for the full pattern.

The Bridge Approach

Emerging best practice: give agents both retrieval tools (semantic search + agentic search) and let them decide per-query which strategy fits the data.
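
In practice this can be as simple as listing both strategies in the agent's tool manifest and describing when each applies. A sketch; the tool names and descriptions are assumptions:

```python
# Both strategies exposed as tools; the choice is deferred to the agent.
TOOLS = {
    "semantic_search": "embedding similarity over a vector index; "
                       "use for fuzzy, conceptual, natural-language queries",
    "grep_search": "exact keyword/regex search over files; "
                   "use for identifiers, paths, error strings",
}

def tool_manifest() -> str:
    """Render the tool list for an agent's system prompt."""
    return "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())

print(tool_manifest())
```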

When to Use Which

LLM Wiki for: personal knowledge bases, research projects (dozens to hundreds of sources), cases where synthesis and cross-referencing matter.

Agentic search for: code, structured data with exact identifiers and file organization.

Semantic search RAG for: enterprise-scale unstructured collections (millions of docs), customer support, compliance, legal — anything requiring synonym matching at scale.

OpenBrain for: personal semantic memory across AI tools — uses vector embeddings (PGVector) but with an MCP interface rather than a traditional RAG pipeline.

Key Stat

One user reported a 95% reduction in token usage after converting 383 scattered files and 100+ meeting transcripts from a traditional approach to a structured wiki.

See Also