OpenAI updated its Agents SDK on April 15 with two capabilities that close critical gaps for enterprise agent deployment: native sandbox execution and a model-native harness designed for frontier models running long-horizon, multi-step tasks.

The sandbox integration supports seven providers at launch: Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel. Developers choose their compute infrastructure; the SDK handles the connection. A new Manifest abstraction lets developers mount local files, define output directories, and pull data from AWS S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2, creating a consistent workspace from prototype to production.
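The announcement describes the Manifest's role (local mounts, an output directory, remote object-store sources) without showing the SDK's actual surface, so the shape of the abstraction can be sketched with illustrative names (`Manifest`, `mount`, `add_remote`, and `RemoteSource` here are assumptions, not OpenAI's real API):

```python
from dataclasses import dataclass, field

@dataclass
class RemoteSource:
    """A cloud object-store prefix pulled into the workspace (illustrative)."""
    provider: str  # e.g. "s3", "gcs", "azure-blob", "r2"
    uri: str       # e.g. "s3://my-bucket/datasets/"
    dest: str      # destination path inside the sandbox workspace

@dataclass
class Manifest:
    """Hypothetical sketch of the workspace manifest the announcement describes."""
    mounts: dict = field(default_factory=dict)   # local path -> sandbox path
    remotes: list = field(default_factory=list)  # object-store sources to pull
    output_dir: str = "/workspace/out"

    def mount(self, local_path: str, sandbox_path: str) -> "Manifest":
        self.mounts[local_path] = sandbox_path
        return self

    def add_remote(self, provider: str, uri: str, dest: str) -> "Manifest":
        self.remotes.append(RemoteSource(provider, uri, dest))
        return self

# The same manifest would define the workspace regardless of which of the
# seven sandbox providers runs it, from local prototyping to production.
manifest = (
    Manifest(output_dir="/workspace/results")
    .mount("./src", "/workspace/src")
    .add_remote("s3", "s3://my-bucket/datasets/", "/workspace/data")
)
```

The appeal of this kind of declarative workspace definition is portability: swapping Modal for E2B, say, would mean changing provider configuration, not rewriting file-handling code.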

“This launch, at its core, is about taking our existing Agents SDK and making it so it’s compatible with all of these sandbox providers,” Karan Sharma from OpenAI’s product team told TechCrunch.

The Harness Problem

The update addresses a tension that has plagued enterprise agent development. According to OpenAI’s announcement, existing approaches each carry trade-offs: model-agnostic frameworks like LangChain are flexible but do not fully exploit frontier-model capabilities; model-provider SDKs sit closer to the model but lack visibility into the harness; managed agent APIs simplify deployment but constrain where agents run.

OpenAI’s answer is a harness that aligns execution with the way frontier models perform best. The updated SDK includes configurable memory, sandbox-aware orchestration, Codex-style filesystem tools (shell execution and an apply-patch tool for file edits), and standardized integrations: MCP for tool use, skills for progressive disclosure, and AGENTS.md for custom instructions.
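The announcement names these harness components without publishing their configuration interface, so the following is only an illustrative sketch of how they might compose (`HarnessConfig`, `shell`, and `apply_patch` are hypothetical stand-ins, not the SDK's actual identifiers):

```python
from dataclasses import dataclass, field

def shell(command: str) -> str:
    """Placeholder for the Codex-style shell-execution tool."""
    ...

def apply_patch(patch: str) -> None:
    """Placeholder for the Codex-style apply-patch file-edit tool."""
    ...

@dataclass
class HarnessConfig:
    """Illustrative bundle of the pieces the announcement lists."""
    model: str
    memory: dict = field(default_factory=dict)       # configurable memory
    tools: list = field(default_factory=list)        # filesystem tools
    mcp_servers: list = field(default_factory=list)  # MCP for tool use
    skills: list = field(default_factory=list)       # progressive disclosure
    instructions_file: str = "AGENTS.md"             # custom instructions

config = HarnessConfig(
    model="gpt-5.4",  # any frontier model; name taken from the article
    memory={"type": "summarizing", "max_tokens": 8192},
    tools=[shell, apply_patch],
    mcp_servers=["filesystem-mcp"],   # hypothetical server name
    skills=["data-analysis"],         # hypothetical skill name
)
```

The point of the sketch is the division of labor: the model reasons, while memory, tools, and instructions are harness concerns configured once per deployment.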

The “in-distribution” framing matters for reliability. When the harness matches the model’s training distribution, agents encounter fewer out-of-distribution failure modes on complex tasks. For enterprise teams running agents on GPT-5.4 or similar frontier models, this translates to more predictable behavior on the multi-step workflows where agents most commonly break.

Security by Architecture

The sandboxing design separates the harness from the compute environment. OpenAI frames this as a security requirement, not just a convenience: “Agent systems should be designed assuming prompt-injection and exfiltration attempts,” the announcement states. “Separating harness and compute helps keep credentials out of environments where model-generated code executes.”

Sandboxed agents operate in controlled workspaces with access only to authorized files and tools. The isolation also enables durable execution: when an agent’s state is externalized, losing a sandbox container does not mean losing the agent’s work.

Same Day, Same Feature

The timing is striking. On the same afternoon, Cloudflare shipped Sandboxes GA as part of its Agents Week, providing persistent isolated Linux environments for agents on its global edge network. OpenAI’s SDK update explicitly lists Cloudflare as a supported sandbox provider. Two major platforms independently arrived at the same architectural conclusion on April 15: sandboxed execution is the production standard for AI agents.

The convergence is not accidental. Both companies are responding to the same enterprise reality. Agents running unrestricted code in production environments are a liability that no security team will approve. Isolated execution environments with scoped access and external state management are the minimum viable trust architecture.

Availability and Roadmap

The new harness and sandbox capabilities launch first in Python, with TypeScript support planned for a later release. OpenAI is also working on code mode and subagents for both languages. All new capabilities are available via the API at standard pricing, with no additional cost for the SDK itself.

The SDK’s provider-agnostic sandbox model means developers are not locked to any single compute vendor. Teams running on Cloudflare, E2B, or their own infrastructure can adopt the SDK without migrating their compute layer. Sharma framed this as the goal: letting users “go build these long-horizon agents using our harness and with whatever infrastructure they have.”