Nous Research shipped a feature this week that no other open-source agent framework has: a closed learning loop that writes, stores, and refines its own reusable skills after completing complex tasks. The Hermes Agent, launched in February 2026 under an MIT license, now automatically detects a reusable pattern once a task involves roughly five or more tool calls, pauses to introspect on what worked, generates a Markdown skill file capturing the workflow, and reuses that skill on subsequent runs, according to coverage from eWEEK and Geeky Gadgets.

The update arrived the same week Hermes overtook OpenClaw as the most actively used open-source agent on OpenRouter’s global daily rankings, processing 224 billion daily tokens to OpenClaw’s 186 billion, according to MarkTechPost. The timing is not coincidental. The learning loop is the specific technical capability driving that adoption shift.

Two Theories of How Agents Should Evolve

The divergence between Hermes and OpenClaw runs deeper than features: the two frameworks disagree on whether agents should learn autonomously or remain under explicit human control.

OpenClaw is organized around a central WebSocket Gateway that routes conversations across 50+ messaging channels to an agent runtime. Its skill system is marketplace-driven: 44,000+ community-authored skills available through ClawHub, each manually configured and explicitly installed by the user. The agent executes what it is told to execute. Memory is Markdown-based, human-authored, and human-maintained. The architecture optimizes for reach, orchestration, and multi-agent coordination.

Hermes takes the opposite position. Its core execution loop follows a “do, learn, improve” cycle, as described in Nous Research’s GitHub repository. After completing a complex task involving multiple tool calls, the agent enters a reflective phase: it reviews its own execution, identifies reusable patterns, and autonomously generates a skill file. That skill is stored locally, loaded on subsequent tasks, and refined further with each use. The architecture optimizes for compounding value over time.
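The "do, learn, improve" cycle can be sketched in a few lines. This is a minimal illustration, not Hermes's implementation: the `Agent` class, the five-call threshold as a constant, and the skill dictionary are all hypothetical stand-ins for the behavior the repository describes.

```python
from dataclasses import dataclass, field

SKILL_THRESHOLD = 5  # reflection triggers at roughly five tool calls, per the article


@dataclass
class Agent:
    skills: dict = field(default_factory=dict)

    def run(self, task: str, tool_calls: list[str]) -> None:
        # "do": execute the task (actual tool execution elided in this sketch)
        if len(tool_calls) >= SKILL_THRESHOLD and task not in self.skills:
            # "learn": reflect on the run and crystallize the workflow into a skill
            self.skills[task] = {"steps": tool_calls, "uses": 0}
        elif task in self.skills:
            # "improve": reuse the stored skill and refine it on later runs
            self.skills[task]["uses"] += 1


agent = Agent()
calls = ["search", "fetch", "parse", "summarize", "write_report"]
agent.run("weekly-report", calls)  # first run crystallizes a skill
agent.run("weekly-report", calls)  # second run reuses it
print(agent.skills["weekly-report"]["uses"])  # → 1
```

The key property is that the second run is cheaper than the first: the agent no longer rediscovers the workflow, it replays and adjusts it.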

Composio’s technical comparison puts the distinction plainly: “OpenClaw is the better control plane. Hermes is the better self-improving runtime.” Or, as its evaluation framework puts it: “OpenClaw felt like a company. Hermes felt like one operator with temporary contractors.”

Inside the Learning Loop

The self-improving skill system operates through what Tencent Cloud’s technical documentation describes as a three-layer memory architecture.

The first layer is working memory: the current session context, volatile and session-scoped. The second is episodic memory: cross-session facts and preferences stored permanently in a SQLite database with FTS5 full-text search. The third is procedural memory: auto-created reusable skills that persist permanently and iterate with each use.
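The episodic layer, as described, amounts to a searchable fact store. A minimal sketch of that idea using Python's built-in `sqlite3` with an FTS5 virtual table follows; the table and column names are hypothetical, and this assumes the bundled SQLite was compiled with FTS5 (true for most standard Python builds).

```python
import sqlite3

# Episodic memory sketch: cross-session facts in SQLite with FTS5 full-text search.
conn = sqlite3.connect(":memory:")  # a real deployment would use a file on disk
conn.execute("CREATE VIRTUAL TABLE episodic USING fts5(fact)")
conn.executemany(
    "INSERT INTO episodic (fact) VALUES (?)",
    [
        ("user prefers quarterly revenue charts",),
        ("deploy target is a small VPS",),
    ],
)

# FTS5 MATCH retrieves facts relevant to the current task
rows = conn.execute(
    "SELECT fact FROM episodic WHERE episodic MATCH ?", ("revenue",)
).fetchall()
print(rows[0][0])  # → user prefers quarterly revenue charts
```

Full-text search is what makes the layer useful at retrieval time: the agent can pull only the facts that match the task at hand rather than replaying an entire history.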

The skill generation itself follows a specific sequence. The agent observes a task, executes it using available tools, enters a reflection phase where it evaluates its own performance, crystallizes the successful pattern into a Markdown skill file with parameters and tool-call sequences, and stores it for future reuse. According to Tencent Cloud’s analysis, the system uses a three-level progressive loading strategy to manage token costs: Level 1 loads only the skill name and description (~20 tokens), Level 2 adds parameter specs (~200 tokens), and Level 3 loads the full execution steps (~1,000+ tokens). The agent loads only what it needs for the current task.
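The three-level loading strategy is straightforward to sketch: each level exposes a superset of the previous one. The skill record and field names below are hypothetical, but the token-budget logic mirrors what Tencent Cloud's analysis describes.

```python
# Hypothetical stored skill record
skill = {
    "name": "build_dashboard",
    "description": "Assemble a metrics dashboard from a data source",
    "parameters": {"source": "str", "period": "str"},
    "steps": ["fetch data", "aggregate by period", "render charts"],
}


def load_skill(skill: dict, level: int) -> dict:
    """Return only the fields needed at the requested loading level."""
    if level == 1:  # name + description (~20 tokens)
        return {k: skill[k] for k in ("name", "description")}
    if level == 2:  # add parameter specs (~200 tokens)
        return {k: skill[k] for k in ("name", "description", "parameters")}
    return skill  # level 3: full execution steps (~1,000+ tokens)


print(sorted(load_skill(skill, 1)))  # → ['description', 'name']
```

Level 1 is cheap enough to load for every stored skill on every turn, which is what lets the agent scan its whole skill library before committing tokens to any single one.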

This is not template reuse. The agent dynamically adjusts variables within skills based on new context parameters. A skill generated from building a financial dashboard with quarterly revenue data will adapt its approach when asked to build a dashboard with monthly user metrics, without generating an entirely new skill.
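One way to picture the difference from plain templating: the skill's steps carry named variables that are rebound per task, so the same learned workflow serves both the quarterly-revenue and monthly-users dashboards. The variable names and step strings here are illustrative only.

```python
# Hypothetical learned skill with adjustable variables
skill_steps = [
    "fetch {metric} data for the last {period}",
    "aggregate {metric} by {period}",
    "render the dashboard",
]


def instantiate(steps: list[str], **context: str) -> list[str]:
    # Rebind the skill's variables to the new task's parameters
    return [s.format(**context) for s in steps]


quarterly = instantiate(skill_steps, metric="revenue", period="quarter")
monthly = instantiate(skill_steps, metric="active users", period="month")
print(monthly[0])  # → fetch active users data for the last month
```

In the real system the adaptation is presumably model-driven rather than simple string substitution, but the structural point holds: one skill, many parameterizations.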

The system is compatible with the agentskills.io open standard, meaning skills generated by one Hermes instance can be shared across deployments.

The Security Contrast

Scale has cost OpenClaw on the security front. CVE-2026-25253, assigned a CVSS score of 8.8, exposed OpenClaw’s gateway to remote exploitation through missing origin validation and rate limiting. In a four-day window in March 2026, nine CVEs were disclosed, with one scoring 9.9. A Koi Security audit of 2,857 ClawHub skills found 341 malicious entries, according to MarkTechPost. The marketplace model that gives OpenClaw its breadth also creates an attack surface that Hermes avoids entirely: Hermes has no centralized skill registry for adversaries to target.

Hermes’s record is not spotless. NVD lists multiple CVEs published April 27-29, 2026, including CVE-2026-7113, a missing-authentication issue in the webhooks endpoint of version 0.8.0. But its self-generated skill model sidesteps the supply-chain risk inherent in community marketplaces. The v0.13.0 release, shipped May 7, addressed eight P0 security issues, including redaction-by-default, guild-scoped Discord role allowlists, and TOCTOU patches across authentication and MCP OAuth flows, according to MarkTechPost.

The tradeoff is clear. OpenClaw’s 44,000-skill marketplace gives users immediate capability but requires trust in third-party code. Hermes generates capabilities from the user’s own workflows, eliminating the supply-chain vector but limiting the breadth of available skills to what the agent has learned.

Where Each Architecture Wins

The architectural split maps to specific use cases, as Composio’s analysis documents from hands-on testing.

OpenClaw excels at multi-agent coordination. Persistent agent teams, cross-session state, and channel-bound agent identities make it the stronger choice when the problem is orchestrating multiple agents across platforms. One agent watches Slack, another handles Telegram, a third monitors email, a fourth coordinates across them. That structure is OpenClaw’s core strength.

Hermes excels at repeated automations that should improve over time. Daily reports, content pipelines, research loops, data collection, monitoring jobs, scheduled tasks. Composio describes the distinction: “Use OpenClaw when agents need to collaborate. Use Hermes when you need fast parallel execution under one controlling agent.”

Background execution also favors Hermes. OpenClaw’s persistent-agent architecture assumes a long-running process with rich in-memory state, which is harder to checkpoint to a remote server. Hermes is built stateless-by-default, with disk-first memory, meaning it can run on a $5 VPS and survive host restarts. The v0.13.0 release added gateway auto-resume after restart, reinforcing this deployment model.
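The "disk-first" property described above can be sketched simply: state is flushed to a file after each step and rebuilt from disk on startup, so a process kill or host restart loses nothing. The checkpoint file name and state shape below are hypothetical, not Hermes internals.

```python
import json
import pathlib

STATE = pathlib.Path("agent_state.json")  # hypothetical checkpoint file


def save(state: dict) -> None:
    # Flush state to disk after each step instead of holding it in memory
    STATE.write_text(json.dumps(state))


def resume() -> dict:
    # On restart (or on a different host), rebuild state from disk
    return json.loads(STATE.read_text()) if STATE.exists() else {}


save({"task": "daily-report", "step": 3})
print(resume()["step"])  # → 3
```

This is the design choice that makes the cheap-VPS deployment story work: nothing essential lives only in process memory, so the process itself is disposable.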

The migration path is also telling. Hermes detects an existing ~/.openclaw directory during setup and offers to import settings, memories, skills, and API keys automatically via hermes claw migrate. The command supports dry-run previews, selective migration presets, and conflict overwrite controls. There is no equivalent OpenClaw command to import Hermes data. The migration flow is one-directional.

The Convergence Case

The binary framing oversimplifies what is actually happening in production. Both MarkTechPost and Geeky Gadgets report that many power users run both frameworks simultaneously: OpenClaw as the orchestrator and multi-channel router, Hermes as the learning loop that handles repeated execution. The two communicate via the Agent Communication Protocol (ACP), with OpenClaw dispatching tasks and Hermes executing them with compounding efficiency.
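The division of labor in that hybrid setup can be sketched as a simple request/response exchange. The envelope fields below are invented for illustration; the article does not document ACP's actual message format.

```python
import json


def dispatch(task: str, payload: dict) -> str:
    # Orchestrator side (OpenClaw in the hybrid setup): wrap a task in an
    # ACP-style envelope. Field names here are hypothetical.
    return json.dumps(
        {"protocol": "acp", "role": "orchestrator", "task": task, "payload": payload}
    )


def execute(message: str) -> dict:
    # Executor side (Hermes in the hybrid setup): unwrap the envelope and
    # run the task, in the real system drawing on its learned skills.
    msg = json.loads(message)
    return {"task": msg["task"], "status": "done"}


result = execute(dispatch("daily-report", {"channel": "slack"}))
print(result["status"])  # → done
```

The appeal of the split is that each framework keeps doing what its architecture optimizes for: OpenClaw routes and coordinates, Hermes executes and compounds.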

Tencent Cloud’s analysis frames this as complementary rather than competitive: “Use OpenClaw when coding in your IDE, and Hermes Agent for everything else.” The comparison table in Nous Research’s own documentation positions OpenClaw as excelling at “deep VS Code integration” while Hermes excels at “long-term collaboration” and meeting users “wherever you are.”

That framing serves Hermes more than OpenClaw. If Hermes is “everything else” and OpenClaw is IDE work, the addressable use case for Hermes is vastly larger.

The Release Velocity Gap

Hermes has shipped at a pace that few open-source projects sustain. Since its February 2026 launch, it has released v0.9.0 “Everywhere” (Android/Termux, iMessage, WeChat, 16 platforms), v0.11.0 “Interface” (React/Ink TUI rewrite, AWS Bedrock, NVIDIA NIM, 17th platform, 1,556 commits), and v0.13.0 “Tenacity” (Kanban multi-agent board, zombie detection, hallucination recovery, 20th platform, 864 commits), according to MarkTechPost.

OpenClaw, meanwhile, transitioned to an independent open-source foundation after founder Peter Steinberger joined OpenAI in February 2026. OpenAI sponsors the foundation, and the project announced LTS (long-term support) in May 2026. The shift to LTS signals stability and enterprise readiness, but it also implies a slower iteration cycle than Hermes’s biweekly major releases.

The GitHub stars tell a different story than the usage data. OpenClaw holds 370,000+ stars to Hermes’s 114,000+. But on daily inference volume, Hermes leads. Stars measure historical accumulation and awareness. Tokens measure current usage. The gap between those metrics suggests that OpenClaw’s install base is larger but less active on a per-user basis.

The Market Fracture

The open-source agent market is splitting along a fault line that mirrors a decades-old tension in software architecture: control versus autonomy, explicit configuration versus learned behavior, marketplace breadth versus individual depth.

OpenClaw bet on being the routing layer for everything. Hermes bet on being the system that gets better the longer you use it. Both bets are working, but for different populations of users and different categories of problems. The question for teams evaluating agent infrastructure in 2026 is not which one is better. It is which problem they are actually solving: coordination across surfaces, or compounding execution quality over time.

The learning loop is now the dividing line.