Gavriel Cohen spent seven years building software at Wix. When he launched his AI marketing agency Qwibit with his brother Lazer in late 2025, he plugged OpenClaw into their sales pipeline via WhatsApp. It worked. The agent tracked leads, assigned tasks, and delivered 9 AM briefings. Then Cohen looked at how it was doing all of that, and the architecture made him lose sleep.
“No isolation between agents, no access controls, all my WhatsApp messages stored in plain text,” Cohen told Forbes. “The agent I set up for sales could see my personal conversations. And the OpenClaw codebase was half a million lines generated in weeks without meaningful review.”
On January 31, 2026, Cohen released NanoClaw under an MIT license. Four months later, the project has crossed 20,000 GitHub stars, surpassed 100,000 downloads, and signed a partnership with Docker Inc. to integrate with Docker Sandboxes. The trajectory tells a story about where agent frameworks are headed as they move from developer toys to production infrastructure.
The Codebase Auditability Problem
The core technical critique is one of size. OpenClaw’s codebase approaches 500,000 lines of code with hundreds of dependencies. NanoClaw’s core logic is roughly 500 lines of TypeScript.
“As a developer, every open source dependency that we added to our codebase, you vet. You look at how many stars it has, who are the maintainers, and if it has a proper process in place,” Cohen explained to VentureBeat. “When you have a codebase with half a million lines of code, nobody’s reviewing that. It breaks the concept of what people rely on with open source.”
This is not a theoretical concern. Andrej Karpathy, the AI researcher and former Tesla AI director, highlighted the auditability advantage in a post on X: “The core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default.”
The distinction between 4,000 lines (the full engine) and 500 lines (core logic) reflects NanoClaw’s layered architecture: a single-process Node.js orchestrator managing a per-group message queue with concurrency control, SQLite for persistence, and filesystem-based IPC. No distributed message brokers. No complex middleware. According to VentureBeat, the entire system can be audited by a human or a secondary AI in roughly eight minutes.
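To make "a per-group message queue with concurrency control" concrete, here is a minimal sketch of the idea. It is not NanoClaw's actual code: the names `GroupScheduler`, `QueuedMessage`, `submit`, and `complete` are hypothetical, and the rule it enforces (one active message per group, plus a global cap on active groups) is an assumption about what such a queue typically does.

```typescript
// Hypothetical sketch of a per-group queue; not taken from NanoClaw.
interface QueuedMessage {
  groupId: string; // which chat group the message belongs to
  text: string;
}

class GroupScheduler {
  private pending: Record<string, QueuedMessage[]> = {};
  private active: Record<string, boolean> = {};
  private activeCount = 0;
  private maxConcurrent: number;
  private start: (msg: QueuedMessage) => void;

  // `start` is expected to kick off work and return immediately;
  // the caller reports completion later through complete().
  constructor(maxConcurrent: number, start: (msg: QueuedMessage) => void) {
    this.maxConcurrent = maxConcurrent;
    this.start = start;
  }

  // Queue a message behind any earlier messages for the same group.
  submit(msg: QueuedMessage): void {
    const queue = this.pending[msg.groupId] || [];
    queue.push(msg);
    this.pending[msg.groupId] = queue;
    this.dispatch();
  }

  // Mark a group's current message as finished and start whatever can run next.
  complete(groupId: string): void {
    if (!this.active[groupId]) return;
    delete this.active[groupId];
    this.activeCount -= 1;
    this.dispatch();
  }

  // Start at most one message per idle group, up to the global limit.
  // Groups are scanned in first-seen order, so this sketch is not strictly fair.
  private dispatch(): void {
    for (const groupId of Object.keys(this.pending)) {
      if (this.activeCount >= this.maxConcurrent) return;
      if (this.active[groupId]) continue;
      const queue = this.pending[groupId];
      if (!queue) continue;
      const msg = queue.shift();
      if (msg === undefined) continue;
      this.active[groupId] = true;
      this.activeCount += 1;
      this.start(msg);
    }
  }
}

// Usage: two groups may run at once, but never two messages from one group.
const started: string[] = [];
const scheduler = new GroupScheduler(2, (msg) => {
  started.push(msg.groupId + ":" + msg.text);
});
scheduler.submit({ groupId: "sales", text: "a" });
scheduler.submit({ groupId: "sales", text: "b" }); // waits until "a" completes
scheduler.submit({ groupId: "family", text: "c" });
```

Keeping the scheduler as an in-process state machine is what lets a design like this avoid an external message broker. The trade-off is that queued messages live in memory unless they are also written to a store such as SQLite.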
Container Isolation vs. Application-Level Security
The architectural divergence between OpenClaw and NanoClaw comes down to where security enforcement lives.
OpenClaw uses application-level controls: allowlists, pairing codes, and permission checks within the agent runtime. NanoClaw enforces isolation at the operating system level by placing every agent inside its own Linux container (Docker on Linux, Apple Containers on macOS).
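For illustration, an application-level check of the kind described here might look like the following sketch. It is a generic example, not OpenClaw's code: `ToolCall`, `Policy`, `normalizePath`, and `isAllowed` are hypothetical names, and the policy shape (a tool allowlist plus one permitted directory) is an assumption.

```typescript
// Hypothetical application-level permission check; not taken from OpenClaw.
interface ToolCall {
  tool: string; // e.g. "read_file"
  path: string; // absolute path the agent wants to touch
}

interface Policy {
  allowedTools: string[];
  allowedRoot: string; // assumed to be an absolute directory other than "/"
}

// Resolve "." and ".." segments lexically. Symlinks are not resolved here.
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      out.pop();
      continue;
    }
    out.push(part);
  }
  return "/" + out.join("/");
}

// Returns true only if the tool is allowlisted and the path stays under allowedRoot.
function isAllowed(call: ToolCall, policy: Policy): boolean {
  const toolOk = policy.allowedTools.indexOf(call.tool) !== -1;
  const root = normalizePath(policy.allowedRoot);
  const target = normalizePath(call.path);
  const pathOk = target === root || target.indexOf(root + "/") === 0;
  return toolOk && pathOk;
}

// Usage: a sales agent limited to reading files under /data/sales.
const salesPolicy: Policy = { allowedTools: ["read_file"], allowedRoot: "/data/sales" };
const decision = isAllowed({ tool: "read_file", path: "/data/sales/leads.csv" }, salesPolicy);
```

A check like this runs in the same process as the agent, so it protects data only as long as every tool routes through it and the check itself has no gaps. That dependency is what the container-based approach removes.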
“They’re running bare metal with some application level checks to try to prevent it from accessing things it shouldn’t access,” Cohen told The Register. “With NanoClaw, each agent runs in its own container. Inside that container it’s just the agentic loop. It’s just the Anthropic Agent SDK. And if you’re connecting it to your WhatsApp, that agent doesn’t see all of your WhatsApp data. It only has the group that that specific agent has been connected to.”
The difference matters most in multi-tenant scenarios. When an agency or enterprise runs agents for multiple clients, application-level isolation depends on the agent respecting boundaries set in code. Container isolation moves enforcement out of the agent process and into the operating system. A prompt injection that pushes the agent beyond its intended behavior still cannot reach data outside the directories explicitly mounted into its container, short of a container escape.
“There’s always going to be a way out if you’re running directly on the host machine,” Cohen told VentureBeat. “In NanoClaw, the ‘blast radius’ of a potential prompt injection is strictly confined to the container and its specific communication channel.”
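As a rough sketch of what "explicitly mounted directories" means in practice, the function below builds the arguments for a `docker run` command that gives one agent access to one group's directory and nothing else on the host. It is not NanoClaw's launcher: `AgentSandbox`, `buildDockerArgs`, the `/workspace` mount point, and the image name are hypothetical. The Docker flags used (`--rm`, `--cap-drop`, `--security-opt no-new-privileges`, `--volume`) are standard `docker run` options.

```typescript
// Hypothetical sketch of a per-group container launch; not taken from NanoClaw.
interface AgentSandbox {
  groupId: string;      // identifier of the single chat group this agent serves
  image: string;        // container image that runs the agent loop (assumed name)
  hostDataRoot: string; // host directory containing one subdirectory per group
}

// Build the argument list for `docker run`, mounting only this group's directory.
function buildDockerArgs(sandbox: AgentSandbox): string[] {
  // Reject ids such as "../personal" that could point the mount elsewhere.
  if (!/^[A-Za-z0-9_-]+$/.test(sandbox.groupId)) {
    throw new Error("invalid group id: " + sandbox.groupId);
  }
  const hostDir = sandbox.hostDataRoot.replace(/\/+$/, "") + "/" + sandbox.groupId;
  return [
    "run",
    "--rm",                                  // remove the container when it exits
    "--cap-drop", "ALL",                     // drop all Linux capabilities
    "--security-opt", "no-new-privileges",   // block privilege escalation via setuid binaries
    "--volume", hostDir + ":/workspace",     // the only host path visible inside
    sandbox.image,
  ];
}

// Usage: the argument list can be handed to a process spawner along with "docker".
const salesArgs = buildDockerArgs({
  groupId: "sales",
  image: "example/agent:latest",
  hostDataRoot: "/srv/agents",
});
```

Because the mount list is fixed before the agent starts, nothing the model later decides to do can widen it. The agent can misbehave inside `/workspace`, but other groups' directories are simply not present in its filesystem.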
Docker’s Bet on Agent Sandboxing
The partnership between NanoClaw’s parent company NanoCo and Docker Inc. signals that the container ecosystem sees agent isolation as a growth category.
Mark Cavage, Docker’s president and COO, framed the challenge to TechTarget: “The infrastructure for the world needs to catch up with where AI agents are. Quite pointedly, agents break the container model. The ecosystem of containers assumes immutability: you build an image, you ship it, and you don’t touch it at runtime. But the very first thing an agent does is, it wants to go mutate its environment. It wants to install packages. It wants to modify files.”
Docker’s solution is Docker Sandboxes, an experimental feature in Docker Desktop that runs AI agents in microVMs on the local machine. NanoClaw’s integration means its containerized agents gain an additional isolation layer: even if an agent compromises its container, the microVM boundary is designed to stop lateral movement to the host OS.
Torsten Volk, an analyst at Omdia (a division of Informa TechTarget), provided context for why this matters: “OpenClaw was developed as a POC by one guy who never intended it to go viral and be used in production. Because it was a POC he did not worry about optimizing the architecture for security, but optimized the entire OpenClaw platform for functionality, ease of use and simple extensibility.”
The Enterprise Adoption Data
Docker recently surveyed business executives and developers worldwide and found that 60% of organizations already have AI agents in production and 94% view building agents as a strategic priority, according to Forbes. The barriers to adoption that respondents cited include lack of enterprise readiness (45%), security (40%), and orchestration difficulties (33%).
The most common enterprise agent use cases reflect this caution: DevOps and CI/CD optimization (38%), security automation (35%), general process automation (34%), and code generation/review (31%). These are internal, controlled environments with clear blast radii. The pattern mirrors early cloud adoption, where internal workloads moved first and customer-facing applications followed only after governance matured.
NanoClaw’s growth tracks this demand. Its “Skills over Features” model rejects the traditional approach of bundling integrations into a monolithic codebase. Instead of shipping Slack, Discord, and Telegram connectors as built-in modules, NanoClaw provides modular instructions that teach a developer’s local AI assistant how to add capabilities by rewriting the local installation.
“If you want Telegram, rip out the WhatsApp and put in Telegram,” Cohen told VentureBeat. “Every person should have exactly the code they need to run their agent. It’s not a Swiss Army knife; it’s a secure harness that you customize.”
The Proof-of-Concept to Production Gap
OpenClaw’s origin story is now well documented. Austrian developer Peter Steinberger created it as a personal project. It went viral. OpenAI hired him. The codebase grew from a proof of concept into a platform used by hundreds of thousands of developers worldwide, with major adoption in China and enterprise pilots across industries.
The security incidents followed the growth. Meta’s director of AI safety, Summer Yue, publicly documented OpenClaw deleting her email inbox in February 2026. Supply chain attacks targeted OpenClaw and Cline users through malicious packages, as reported by Dark Reading via TechTarget’s coverage. The architecture that made OpenClaw easy to extend (full local machine access, persistent memory, external communication) also made it easy to exploit.
This gap between viral adoption and production readiness is not unique to OpenClaw. But agent frameworks face a version of the problem that traditional software does not: the agent itself is an autonomous actor with unpredictable behavior. Traditional containers isolate deterministic processes. Agent containers must isolate processes whose behavior depends on model outputs, prompt injections, and tool-use chains that no developer can fully predict.
“The comparison isn’t one hundred percent accuracy,” Cohen told The Register. “When you work with a colleague, a teammate, an employee, they don’t get everything right. Things fall through the cracks as well.” The question is whether the architecture limits the damage when something falls through.
The Competing Approaches
NanoClaw is not the only framework attempting to solve agent isolation. IronClaw uses WebAssembly (WASM) sandboxing for zero-trust execution. ZeroClaw takes a different approach to permission boundaries. OpenClaw itself has added security features since the early incidents, including tighter allowlists and audit logging.
The differentiation comes down to philosophy. OpenClaw optimizes for breadth: 50+ modules, hundreds of integrations, a community-driven marketplace of capabilities. NanoClaw optimizes for depth of isolation: fewer integrations, but each one runs with hard OS-level boundaries that cannot be circumvented by a sufficiently clever prompt.
Cohen’s bet is that enterprises will choose depth. “I think that what we’re building can be the orchestration layer that a lot of people are talking about that you need on top of agents,” he told The Register. “That right kind of abstraction nudges people towards using pre-built solid pieces instead of trying to build their own agents.”
The Architecture Decision Every Agent Team Faces
The NanoClaw story crystallizes a choice that every team deploying AI agents in production must now make: application-level trust or OS-level containment.
Application-level security is faster to implement, easier to customize, and works well for single-user developer tools where the operator and the user are the same person. OS-level containment adds complexity but provides guarantees that survive model failures, prompt injections, and zero-day exploits in the agent runtime itself.
Docker’s investment suggests the infrastructure industry is betting on containment. Cohen’s trajectory from weekend project to Docker partnership in four months suggests the market agrees. The 60% of organizations already running agents in production will face this architectural question repeatedly as their deployments expand from internal DevOps to customer-facing workflows.
The answer likely is not one or the other, but layered: application-level controls for convenience, OS-level isolation for consequences. The teams that get production agent security right will be the ones that treat their AI agents the way operations teams learned to treat microservices: assume they will fail, and architect so that failures stay contained.