LangChain co-founder Harrison Chase published a framework for understanding how AI agents improve over time, and the taxonomy puts OpenClaw’s architecture at the center of the most practical layer.

The blog post, published April 5, breaks continual learning into three distinct layers: model (updating weights via fine-tuning or reinforcement learning), harness (optimizing the code and instructions that drive an agent), and context (updating instructions, skills, and memory that configure the harness from outside). Most industry discussion fixates on the first layer. Chase argues the third is where production agents actually get better.

The Three-Layer Framework

For model-level learning, Chase points to OpenAI’s Codex models — trained specifically for one agentic system. The limitation is catastrophic forgetting: retrain on new data and the model can lose capabilities it previously had. Per-user model training (via LoRA adapters) is theoretically possible but rare in practice.
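What per-user adapters could look like in practice, sketched here with Hugging Face’s PEFT library. This is an illustration, not anything from Chase’s post: the model id and adapter paths are placeholders.

```python
# Sketch of per-user LoRA adapters: one small adapter per user on top of
# a frozen shared base model, rather than retraining the base itself.
from peft import LoraConfig, get_peft_model, PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("some-base-model")  # placeholder id

# Only the adapter weights train, sidestepping catastrophic forgetting
# in the shared base weights.
config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, config)
# ... fine-tune `model` on one user's data, then save just the adapter:
model.save_pretrained("adapters/user_123")  # placeholder path

# At inference, load the matching adapter onto the shared base.
personalized = PeftModel.from_pretrained(base, "adapters/user_123")
```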

Harness-level learning uses agent execution traces to improve the scaffolding code itself. Chase cites the Meta-Harness paper — a system that runs agents over tasks, evaluates results, stores traces, then uses a coding agent to suggest improvements to the harness. LangChain’s own Deep Agents framework improved on terminal benchmarks using exactly this pattern.
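In outline, that loop looks something like the sketch below. Every name here is a hypothetical stand-in; it reproduces neither the Meta-Harness paper’s interface nor Deep Agents’ code.

```python
# One trace-driven harness-improvement cycle. The callables passed in
# (run_agent, evaluate, coding_agent) are hypothetical stand-ins.
import json
from pathlib import Path
from typing import Callable

def improvement_cycle(
    tasks: list[str],
    harness_source: Path,
    run_agent: Callable[[str, str], dict],     # (task, harness code) -> trace
    evaluate: Callable[[str, dict], float],    # (task, trace) -> score
    coding_agent: Callable[[str, dict], str],  # (prompt, context) -> patch
    trace_dir: Path = Path("traces"),
) -> tuple[float, str]:
    trace_dir.mkdir(exist_ok=True)
    harness_code = harness_source.read_text()
    scores = []
    # 1. Run the agent over a task suite, scoring and persisting each trace.
    for i, task in enumerate(tasks):
        trace = run_agent(task, harness_code)
        score = evaluate(task, trace)
        (trace_dir / f"trace_{i}.json").write_text(
            json.dumps({"task": task, "trace": trace, "score": score})
        )
        scores.append(score)
    # 2. Hand the traces plus the current harness code to a coding agent
    #    and ask it to propose a patch to the scaffolding itself.
    traces = [json.loads(p.read_text()) for p in sorted(trace_dir.glob("trace_*.json"))]
    patch = coding_agent(
        "Given these traces and scores, propose improvements to the harness.",
        {"harness": harness_code, "traces": traces},
    )
    return sum(scores) / max(len(scores), 1), patch
```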

Context-level learning is where the post gets specific. This is learning that happens outside the harness — in configuration files, memory stores, and skill definitions that agents read and update over time. Chase identifies two sub-levels: agent-level (the agent updates its own global configuration persistently) and tenant-level (per-user, per-org, or per-team memory).

OpenClaw as the Agent-Level Reference

For agent-level context learning, Chase names one working implementation: OpenClaw. Specifically, he points to SOUL.md — the file where an OpenClaw agent stores its identity, learned preferences, and operational rules — and notes it “gets updated over time” as the agent runs. He also references OpenClaw’s “dreaming” mechanism, which processes execution traces in offline jobs to extract insights and update context.
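The post doesn’t describe the mechanics, but the shape of such an offline job is straightforward. A minimal sketch, assuming traces are logged as JSON lines and `summarize` wraps an LLM call; none of these names come from OpenClaw’s actual code.

```python
# A "dreaming"-style offline job: read recent execution traces, distill
# durable insights, append them to the agent's persistent context file.
import json
from datetime import date
from pathlib import Path
from typing import Callable

def dream(trace_log: Path, soul_file: Path, summarize: Callable[[str], str]) -> None:
    # Gather recent traces, assumed here to be one JSON object per line.
    traces = [json.loads(line) for line in trace_log.read_text().splitlines() if line.strip()]
    if not traces:
        return
    # Ask the model for operational rules worth keeping, not a transcript summary.
    insights = summarize(
        "Extract durable preferences and operational rules from these "
        "agent traces:\n" + json.dumps(traces)
    )
    # Append to the agent-maintained file (the role SOUL.md plays in OpenClaw).
    with soul_file.open("a") as f:
        f.write(f"\n## Learned {date.today().isoformat()}\n{insights}\n")
```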

For tenant-level context learning, Chase cites Hex’s Context Studio, Decagon’s Duet, and Sierra’s Explorer — all commercial products that maintain per-customer agent memory. He notes the sub-levels can be combined: a single agent can have agent-level, user-level, and org-level context updates running simultaneously.
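Combining the scopes at prompt-assembly time might look like the rough sketch below. The precedence order and file layout are assumptions; the post only says the levels can coexist.

```python
# Stacking context scopes into one instruction block. Agent-level context
# comes first, then org- and user-level tenant memory layered on top.
from pathlib import Path

def assemble_context(root: Path, org_id: str, user_id: str) -> str:
    layers = [
        root / "SOUL.md",                # agent-level, agent-maintained
        root / "orgs" / f"{org_id}.md",  # org-level tenant memory
        root / "users" / f"{user_id}.md",  # user-level tenant memory
    ]
    # Later layers effectively override earlier ones by appearing later
    # in the assembled instructions; missing files are simply skipped.
    return "\n\n".join(p.read_text() for p in layers if p.exists())
```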

Claude Code’s CLAUDE.md file gets a mention in the same taxonomy — mapped as “user context” that configures the harness from outside. But where Claude Code’s file is user-authored configuration, OpenClaw’s SOUL.md is positioned as something the agent itself maintains.

Why This Matters for Agent Infrastructure

LangChain has 100,000+ GitHub stars and sits at the center of the Python agent ecosystem. When Chase publishes a taxonomy of how agents learn and names specific implementations as reference points, it shapes how thousands of developers think about architecture decisions.

The post also reveals LangChain’s product direction. Chase closes by positioning LangSmith (their observability platform) as the trace collection layer powering all three learning types, and Deep Agents (their open-source harness) as production-ready for context-level memory. The taxonomy is not neutral observation — it is a product roadmap framed as education.

For the broader agent ecosystem, the signal is clear: persistent self-updating agents are moving from experimental feature to expected baseline. If the largest agent framework vendor is building its product stack around the assumption that agents should learn and rewrite their own configuration, the tools and platforms that don’t support this pattern will need to catch up.

LangChain’s full post includes a comparison table across all three learning layers and links to their Deep Agents documentation for implementation details.