On April 3, Andrej Karpathy — OpenAI co-founder and former Director of AI at Tesla — posted on X describing how he manages research memory across long-running AI projects. The approach, which he calls LLM Knowledge Bases, skips vector databases entirely. Instead, the LLM itself maintains a growing Markdown wiki: reading raw inputs, compiling structured summaries, creating backlinks between concepts, and running periodic “linting” passes to find inconsistencies or missing connections.
As VentureBeat reported, Karpathy is solving a specific problem: the context-limit reset. Every time an agent session ends or a token limit is hit, the model loses its working understanding of the project. Developers have to re-explain architecture decisions, constraints, and accumulated context. Karpathy’s system keeps that context alive in files — human-readable, auditable, and persistent across sessions.
Why RAG Falls Short for Agents
Standard RAG works by chunking documents into fragments, encoding them as vectors, and retrieving the closest matches at query time. According to News Directory 3’s coverage, Karpathy argues that cosine similarity is a blunt instrument for agents operating over structured knowledge. A vector search retrieves content that sounds similar — but can miss the specific logical relationship or architectural constraint an agent needs to reason correctly. In a large codebase, this means retrieving plausible-looking snippets that don’t reflect the actual system state.
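To make the retrieval mechanics concrete, here is a minimal sketch of the chunk-embed-rank loop. It is illustrative only: bag-of-words counts stand in for a real embedding model, and the chunks and query are invented examples, but the ranking step is the same cosine-similarity comparison the article describes.

```python
# Toy RAG retrieval: "embed" chunks as word-count vectors, then rank
# candidates by cosine similarity to the query. A real system would use
# a neural encoder, but the retrieval logic is the same.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in embedding: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical chunks from a codebase's docs.
chunks = [
    "the auth service retries failed logins three times",
    "timeout for the auth service must never exceed 2 seconds",
]
query = "auth service retry behavior"
ranked = sorted(chunks, key=lambda c: cosine(embed(query), embed(c)),
                reverse=True)
```

Note that surface word overlap drives the ranking: "retry" does not even match "retries" here, and nothing in the score captures whether a chunk states the constraint the agent actually needs — which is the bluntness Karpathy is pointing at.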
LLM Knowledge Bases sidestep this by making the LLM the author of the knowledge representation, not just a reader of it. The model doesn’t search an index. It reasons over structured text it helped write.
The Three-Stage Architecture
Per the News Directory 3 breakdown and VentureBeat’s analysis, the system runs in three stages:
- Data ingest — Raw materials (papers, repos, articles, notes) go into a raw/ directory. Karpathy uses Obsidian Web Clipper to convert web content to Markdown with locally stored images.
- Compilation — The LLM reads the raw directory and writes a structured wiki: summaries, encyclopedia-style articles on key concepts, and explicit backlinks between related ideas.
- Linting — Periodic passes where the LLM scans the wiki for inconsistencies, gaps, and new connections. The knowledge base effectively repairs and extends itself over time.
Every claim the LLM makes can be traced to a specific Markdown file a human can read, edit, or delete. No opaque embedding vectors.
What This Means for Agent Builders
Karpathy describes his current setup as “a hacky collection of scripts.” That’s the entry point. The architectural idea — a persistent, LLM-maintained knowledge layer that outlives individual sessions — is directly applicable to any agent system that accumulates context over days or weeks.
For OpenClaw operators and agent builders, the immediate parallel is agent memory design. Most current approaches either rely on a vector store (with its retrieval limitations) or pass everything through a growing context window (which hits token ceilings fast). Karpathy’s system offers a third path: structured, authored, and self-maintaining. The knowledge base grows alongside the agent’s work and survives session resets without requiring re-initialization.
The builder community on X responded quickly. Developer @himanshustwts published a visual architecture diagram of the system within hours of Karpathy’s post, and a 10-agent “Swarm Knowledge Base” implementation using Hermes for quality validation was shared the same day, according to News Directory 3.