Author: Yu Tang

The Rise of Agentic Retrieval Over Vector Indexing


The success of Claude Code has shown that you can skip heavyweight vector databases and let the model itself handle retrieval with simple tools: a well-written llms.txt and grep calls. Surprisingly, this minimalist approach delivers more accurate and faster retrieval, showing that a reasoning-driven model can outperform embedding-based systems in both precision and latency.

We take that same principle beyond code.

PageIndex is an LLM-native, vectorless index for PDFs and long-form documents — a hierarchical table-of-contents tree that lives inside the model’s context window, enabling the model to reason and navigate like a human reader. No vector DB required.
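
To make that concrete, here is a minimal sketch of what such a hierarchical index might look like in code. The node fields and the plain-text rendering are illustrative assumptions, not PageIndex's actual schema:

```python
# A toy hierarchical table-of-contents index. Field names are illustrative,
# not PageIndex's real schema.
from dataclasses import dataclass, field

@dataclass
class TocNode:
    title: str                                  # section heading as it appears in the document
    page_range: tuple[int, int]                 # (start_page, end_page) in the PDF
    summary: str = ""                           # one-line description of the section
    children: list["TocNode"] = field(default_factory=list)

def render(node: TocNode, depth: int = 0) -> str:
    """Serialize the tree as plain text that fits in the model's context,
    so the LLM can decide which section to open next."""
    line = f"{'  ' * depth}- {node.title} (pp. {node.page_range[0]}-{node.page_range[1]}): {node.summary}"
    return "\n".join([line] + [render(child, depth + 1) for child in node.children])
```

The model never sees an embedding; it reads this outline, picks a section, and requests those pages, much like a reader flipping to a chapter.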


The RAG Pipeline We All Know, and Increasingly Use Less

The classic RAG pipeline works like this:

split content into chunks → embed → store in a vector DB → semantic search → (blend results with keyword search) → (rerank) → stuff the context → answer.
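
For reference, here is a rough sketch of that pipeline. embed(), rerank(), and llm() are placeholders for whatever embedding model, reranker, and LLM you plug in, and the chunking parameters are arbitrary:

```python
# Illustrative sketch of the classic RAG pipeline; embed(), rerank(), and llm()
# are stand-ins, not any particular library's API.
import numpy as np

def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    # split content into overlapping character chunks
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def build_index(docs: list[str]):
    chunks = [c for doc in docs for c in chunk(doc)]
    vectors = np.array([embed(c) for c in chunks])    # normally stored in a vector DB
    return chunks, vectors

def answer(query: str, chunks: list[str], vectors: np.ndarray, k: int = 5) -> str:
    q = np.array(embed(query))
    # semantic search: cosine similarity against every chunk
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    top = [chunks[i] for i in np.argsort(-scores)[:k]]
    top = rerank(query, top)                          # optional rerank step
    context = "\n\n".join(top)                        # stuff the context
    return llm(f"Answer using this context:\n{context}\n\nQuestion: {query}")
```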

It works, but it’s complex to build, hard to maintain, and slow to iterate on, and it’s often more infrastructure than you really need.

In contrast, a new wave of code agents takes a refreshingly simple approach:

Benchmark comparison of different retrieval methods, from Lance Martin’s blog post

In real-world tests on developer docs, practitioners have found that a well-crafted llms.txt (URLs plus succinct descriptions) combined with simple tool calls such as grep outperforms vector-DB pipelines on a range of coding tasks. It’s not only more accurate, but also dramatically easier to maintain and update.
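
Here is a rough sketch of that agentic loop, under simplifying assumptions: call_llm() stands in for your model API, and the GREP: prefix is just an invented convention for the model to request a search.

```python
# Minimal sketch of the agentic alternative: an llms.txt manifest plus a
# grep-style tool. call_llm() and the GREP: convention are illustrative only.
import re
from pathlib import Path

def grep(pattern: str, root: str = "docs/") -> str:
    """Return matching lines as file:line:text for the model to read."""
    hits = []
    for path in Path(root).rglob("*.md"):
        for n, line in enumerate(path.read_text().splitlines(), 1):
            if re.search(pattern, line, re.IGNORECASE):
                hits.append(f"{path}:{n}:{line.strip()}")
    return "\n".join(hits[:50]) or "no matches"

def agentic_answer(question: str, max_steps: int = 5) -> str:
    context = Path("llms.txt").read_text()     # URLs + succinct descriptions
    reply = ""
    for _ in range(max_steps):
        reply = call_llm(context, question)    # model decides: answer, or ask to search
        if reply.startswith("GREP:"):
            context += "\n" + grep(reply.removeprefix("GREP:").strip())
        else:
            return reply
    return reply                               # give up after max_steps
```

There is no index to build or keep in sync; updating the docs and the llms.txt manifest is the whole maintenance story.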

Why does this minimalist approach work so well?