Tencent Open-Sources TencentDB Agent Memory, a 4-Tier Local Memory Pipeline That Cuts Agent Token Usage by 61%

Tencent has released TencentDB Agent Memory, an MIT-licensed memory system designed to solve one of the most persistent problems in production AI agents: context bloat and recall failure across long-running sessions. The project, published on GitHub and covered by MarkTechPost, ships as an OpenClaw plugin and a Hermes Agent Docker image with zero external API dependencies.

How the 4-Tier Pyramid Works

Most agent memory systems dump data into flat vector stores and rely on similarity search to recall fragments. TencentDB Agent Memory replaces that approach with a four-layer semantic pyramid:

L0 Conversation: Raw dialogue logs.
L1 Atom: Atomic facts extracted from conversations, stored as JSONL.
L2 Scenario: Scene blocks grouping related facts into coherent contexts, stored as Markdown.
L3 Persona: A synthesized user profile carrying day-to-day preferences.

The system queries the Persona layer first and drills down to lower layers only when finer detail is needed. Lower layers preserve evidence; upper layers preserve structure. Storage is heterogeneous: facts and traces go into SQLite, with the sqlite-vec extension handling vector retrieval, while personas, scenes, and task canvases are stored as human-readable Markdown files.
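The top-down lookup described above can be illustrated with a minimal sketch. This is not Tencent's implementation: the `TieredMemory` class, the `need_detail` flag, and the substring match standing in for real retrieval are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Tier labels follow the article's pyramid, ordered from the top layer down.
TIERS = ["L3 Persona", "L2 Scenario", "L1 Atom", "L0 Conversation"]


@dataclass
class TieredMemory:
    """Hypothetical in-memory stand-in for the four-layer store."""

    layers: dict = field(default_factory=lambda: {tier: [] for tier in TIERS})

    def add(self, tier: str, text: str) -> None:
        self.layers[tier].append(text)

    def recall(self, keyword: str, need_detail: bool = False) -> list:
        """Search from Persona downward.

        Stops at the first tier that yields a hit, unless the caller asks
        for finer detail, in which case lower tiers are searched as well.
        Substring matching is an assumption replacing real retrieval.
        """
        hits = []
        for tier in TIERS:
            matches = [t for t in self.layers[tier] if keyword.lower() in t.lower()]
            hits.extend((tier, m) for m in matches)
            if matches and not need_detail:
                break
        return hits
```

In this sketch a query answered by the persona never touches the raw conversation logs, which is the property that keeps recalled context small.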

Symbolic Short-Term Memory

For in-session context, the system addresses a different problem: verbose tool logs, search results, and error traces that consume tokens rapidly during long tasks. TencentDB Agent Memory offloads full tool logs to external files under refs/*.md and encodes state transitions into compact Mermaid syntax inside a lightweight task canvas.

The agent reasons over the Mermaid symbol graph in its context window. When it needs raw text, it retrieves the corresponding file using a node_id reference. According to the GitHub documentation, this creates a “deterministic drill-down from top-layer symbol to mid-layer index to bottom-layer raw text.”
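A rough sketch of that offload-and-reference pattern follows. The `TaskCanvas` class, the `n1`, `n2` identifier scheme, and the `refs/<node_id>.md` file naming are assumptions for illustration; the article only states that logs live under refs/*.md and are retrieved by node_id.

```python
import tempfile
from pathlib import Path


class TaskCanvas:
    """Hypothetical task canvas: compact Mermaid graph plus offloaded raw logs."""

    def __init__(self, root: Path):
        self.refs = root / "refs"
        self.refs.mkdir(parents=True, exist_ok=True)
        self.nodes = []  # list of (node_id, short label) in execution order

    def record(self, label: str, raw_log: str) -> str:
        """Write the full log to refs/<node_id>.md and keep only a short label."""
        node_id = f"n{len(self.nodes) + 1}"
        (self.refs / f"{node_id}.md").write_text(raw_log, encoding="utf-8")
        self.nodes.append((node_id, label))
        return node_id

    def to_mermaid(self) -> str:
        """Render the steps as a Mermaid flowchart, one edge per transition."""
        lines = ["graph TD"]
        for node_id, label in self.nodes:
            lines.append(f'    {node_id}["{label}"]')
        for (src, _), (dst, _) in zip(self.nodes, self.nodes[1:]):
            lines.append(f"    {src} --> {dst}")
        return "\n".join(lines)

    def drill_down(self, node_id: str) -> str:
        """Fetch the raw text behind a node only when it is actually needed."""
        return (self.refs / f"{node_id}.md").read_text(encoding="utf-8")
```

Only the output of `to_mermaid()` would sit in the context window here; a 1,400-character test log collapses to a single labeled node until `drill_down` is called. Labels containing double quotes would need escaping in a real canvas.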

Benchmark Results

Tencent published benchmark results measured over continuous long-horizon sessions, not isolated turns. The SWE-bench setup, for instance, runs 50 consecutive tasks per session to simulate context-accumulation pressure. The reported figures:

WideSearch: Pass rate rose from 33% to 50% (+51.52% relative). Token usage dropped from 221.31M to 85.64M, a 61.38% reduction.
SWE-bench: Success climbed from 58.4% to 64.2%. Tokens fell from 3,474.1M to 2,375.4M, a 33.09% reduction.
PersonaMem (long-term): Accuracy rose from 48% to 76%.

These numbers come from Tencent’s own evaluations and have not been independently replicated.
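The WideSearch line mixes two ways of expressing a gain: 33% to 50% is 17 percentage points, but a 51.52% relative improvement. A short calculation makes the distinction explicit; the helper names are illustrative only.

```python
def absolute_gain(before: float, after: float) -> float:
    """Difference in percentage points between two rates."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting value."""
    return (after - before) / before * 100.0


# WideSearch pass rate, using the figures reported above.
print(absolute_gain(33, 50))              # 17 percentage points
print(round(relative_change(33, 50), 2))  # 51.52 (% relative)

# PersonaMem accuracy: 28 points absolute, about 58.33% relative.
print(absolute_gain(48, 76))
print(round(relative_change(48, 76), 2))
```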

Integration and Developer Surface

The OpenClaw integration ships as a single npm package: @tencentdb-agent-memory/memory-tencentdb. It requires Node.js 22.16 or higher and can be enabled with one config flag. The plugin handles conversation capture, memory extraction, scene aggregation, persona generation, and recall automatically.

For Hermes Agent, a Docker image bundles the agent, plugin, and memory gateway. The default model is Tencent Cloud’s DeepSeek-V3.2, though any OpenAI-compatible endpoint works through a custom provider flag. Two tools are exposed to agents during sessions: tdai_memory_search and tdai_conversation_search, both returning references with node_id fields for traceback.

Retrieval defaults to a hybrid strategy combining BM25 keyword search with vector embeddings, fused using Reciprocal Rank Fusion (RRF). The BM25 tokenizer supports both Chinese and English. Default settings trigger an L1 memory extraction every five turns and generate a persona every 50 new memories. Recall returns five items by default with a 5-second timeout; on timeout, the system skips injection rather than blocking the conversation.
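Reciprocal Rank Fusion itself is simple to express: each document scores the sum of 1 / (k + rank) across the ranked lists it appears in. The sketch below is a generic version, not the plugin's code. The constant k = 60 is the value commonly used in the RRF literature, and whether TencentDB Agent Memory uses it is an assumption; `top_n=5` mirrors the default recall size mentioned above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document IDs into one list.

    rankings: iterable of lists, each ordered best-first (e.g. one from
              BM25, one from vector search).
    k:        smoothing constant; 60 is conventional, assumed here.
    top_n:    number of fused results to return.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top_n]


bm25_hits = ["a", "b", "c"]    # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]  # hypothetical embedding ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only rank positions, it sidesteps the problem of BM25 scores and cosine similarities living on incomparable scales.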

The Infrastructure Play

The MIT license signals Tencent’s bet on adoption across both the OpenClaw and Hermes ecosystems. By commoditizing the memory layer, Tencent is positioning itself as infrastructure rather than competing at the model or framework level. The project competes directly with proprietary memory solutions from LangChain and with Anthropic’s built-in memory offerings. For agent builders currently paying for external vector databases like Pinecone or Weaviate, a fully local alternative that runs on SQLite changes the cost calculus for persistent agent deployments.
