Author: Yu Tang
The success of Claude Code has shown that you can skip heavyweight vector databases and let the model itself handle retrieval with simple tools: a well-written llms.txt file and grep calls. Surprisingly, this minimalist approach often retrieves more accurately and faster, showing that a reasoning-driven model can outperform embedding-based systems on both precision and latency.
We take that same principle beyond code.
PageIndex is an LLM-native, vectorless index for PDFs and long-form documents — a hierarchical table-of-contents tree that lives inside the model’s context window, enabling the model to reason and navigate like a human reader. No vector DB required.
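To make the idea concrete, here is a minimal sketch of what such a table-of-contents tree and its prompt rendering might look like. The node fields and helper names are illustrative assumptions, not PageIndex's actual data model.

```python
# Illustrative sketch of a hierarchical table-of-contents index.
# Node fields and function names are hypothetical, not PageIndex's API.
from dataclasses import dataclass, field


@dataclass
class TocNode:
    title: str                      # section heading as it appears in the document
    summary: str                    # short description the LLM can reason over
    page_range: tuple[int, int]     # where the section lives in the PDF
    children: list["TocNode"] = field(default_factory=list)


def render_toc(node: TocNode, depth: int = 0) -> str:
    """Flatten the tree into a compact outline that fits in the prompt."""
    line = f"{'  ' * depth}- {node.title} (pp. {node.page_range[0]}-{node.page_range[1]}): {node.summary}"
    return "\n".join([line] + [render_toc(c, depth + 1) for c in node.children])


# The model sees the rendered outline, picks a section by reasoning about the
# question, and only that section's pages are loaded into context.
toc = TocNode(
    title="Annual Report 2023",
    summary="Full-year financial results and outlook",
    page_range=(1, 120),
    children=[
        TocNode("Revenue by Segment", "Breakdown of revenue across business units", (14, 22)),
        TocNode("Risk Factors", "Regulatory, market, and operational risks", (45, 60)),
    ],
)
print(render_toc(toc))
```

The model never searches by similarity here: it reads the outline, picks a branch, and drills down, the way a person scans a table of contents.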
The classic RAG pipeline works like this:
split content into chunks → embed → store in a vector DB → semantic search → (blend results with keyword search) → (rerank) → stuff the context → answer.
It works, but it’s complex to build, hard to maintain, and slow to iterate; often it’s more infrastructure than you really need.
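For a sense of how many moving parts that is, here is a toy sketch of the classic pipeline. The bag-of-words "embedding" and in-memory "vector DB" are stand-ins for real components, assumed purely for illustration.

```python
# Toy, end-to-end version of the classic RAG pipeline, just to show the moving
# parts. The embedding and the query string are placeholders, not real components.
import math
from collections import Counter


def chunk(text: str, size: int = 200) -> list[str]:
    """Split content into fixed-size chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(text: str) -> Counter:
    # Stand-in embedding: bag-of-words counts instead of a neural encoder.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# "Vector DB": chunks stored next to their embeddings.
corpus = "..."  # your documents here
index = [(c, embed(c)) for c in chunk(corpus)]


def retrieve(query: str, k: int = 3) -> list[str]:
    """Semantic search: rank stored chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]


# The retrieved chunks are then stuffed into the prompt and sent to the LLM.
context = "\n---\n".join(retrieve("example question"))
```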
In contrast, a new wave of code agents takes a refreshingly simple approach:

Benchmark comparison of different retrieval methods from Lance Martin’s blog post
In real-world tests on developer docs, practitioners have found that a well-crafted llms.txt file (URLs plus succinct descriptions), combined with simple tool calls such as grep, outperforms vector-DB pipelines on a range of coding tasks. It’s not only more accurate, but also dramatically easier to maintain and update.
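A minimal sketch of what "llms.txt plus grep" can look like when exposed as agent tools, assuming a local docs/ directory and a hypothetical tool interface:

```python
# Minimal sketch of the "llms.txt + grep" pattern: the model reads a short
# index file to decide where to look, then greps for exact strings.
# File paths and the tool wiring are illustrative assumptions.
import subprocess
from pathlib import Path


def read_llms_txt(path: str = "llms.txt") -> str:
    """Return the index file verbatim; the LLM reasons over it directly."""
    return Path(path).read_text()


def grep_docs(pattern: str, docs_dir: str = "docs/") -> str:
    """Run grep the way a code agent would, returning matching lines with file names."""
    result = subprocess.run(
        ["grep", "-rn", "--include=*.md", pattern, docs_dir],
        capture_output=True, text=True,
    )
    return result.stdout or "no matches"


# Exposed to the model as two tools; no embeddings, no vector store.
tools = {"read_llms_txt": read_llms_txt, "grep_docs": grep_docs}
```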
Why does this minimalist approach work so well?