Anthropic announced three new capabilities for its Claude Managed Agents platform at the Code with Claude developer conference in San Francisco: a self-improving memory system called “dreaming,” rubric-based agent evaluation called “outcomes,” and multi-agent orchestration for parallel task delegation. The features collapse infrastructure layers that enterprises typically assemble from separate vendors into a single hosted runtime.
Dreaming: Cross-Session Pattern Extraction
Dreaming is a scheduled process that reviews an agent’s past sessions and memory stores, extracts patterns across them, and curates those memories so agents improve over time. Unlike conventional memory systems that persist context within a single conversation, dreaming operates across agents and sessions to surface recurring mistakes, convergent workflows, and shared team preferences, according to Anthropic.
The process does not modify underlying model weights. “We’re not changing the model itself through dreaming,” Alex Albert, who leads research product management at Anthropic, told VentureBeat. Instead, agents write plain-text notes and structured “playbooks” that future sessions reference, making the system auditable by humans. Albert compared it to how employees create documentation after completing a workflow: “Instead of you manually creating the skill from your experience working with Claude, the model is doing it.”
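Anthropic has not published the playbook format or the dreaming API, but the mechanism the company describes resembles a scheduled curation pass over plain-text artifacts. The following is a minimal sketch of that pattern using the standard Anthropic Python SDK; the sessions directory, playbook file, and model id are illustrative assumptions, not the platform's actual interface.

```python
from pathlib import Path
import anthropic

client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"      # placeholder; substitute any available Claude model id

SESSIONS_DIR = Path("sessions")  # hypothetical: one plain-text transcript per past session
PLAYBOOK = Path("playbook.md")   # hypothetical: curated notes that future sessions load

def dream() -> None:
    """Scheduled pass: review recent transcripts, distill recurring lessons into the playbook."""
    transcripts = "\n\n---\n\n".join(p.read_text() for p in sorted(SESSIONS_DIR.glob("*.txt")))
    existing = PLAYBOOK.read_text() if PLAYBOOK.exists() else ""

    response = client.messages.create(
        model=MODEL,
        max_tokens=2000,
        system=(
            "You curate an agent playbook. From the session transcripts, extract "
            "recurring mistakes, workflows that worked, and shared team preferences. "
            "Merge them into the existing playbook; keep it short and plain-text."
        ),
        messages=[{
            "role": "user",
            "content": f"Existing playbook:\n{existing}\n\nRecent sessions:\n{transcripts}",
        }],
    )
    # Human-auditable notes on disk; no model weights are changed.
    PLAYBOOK.write_text(response.content[0].text)
```

A future session would simply prepend the playbook's contents to its system prompt, which is what makes the learned behavior inspectable and editable by people.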
Ars Technica noted that dreaming addresses a practical constraint: context windows are limited, and important information gets lost over lengthy projects. Standard compaction techniques handle this within a single conversation, but dreaming pulls shared learnings across multiple agents and sessions.
Dreaming is currently in research preview with access by request.
Outcomes and Multi-Agent Orchestration Move to Public Beta
Outcomes lets teams define rubrics describing what success looks like for a given task. A separate grader evaluates agent output against those criteria in its own context window, isolated from the agent’s reasoning process. When the grader identifies gaps, the agent takes another pass. Anthropic reported that outcomes improved task success rates by up to 10 points over standard prompting loops in internal benchmarks, with file-generation quality gains of 8.4% on .docx files and 10.1% on .pptx files, per the company’s blog post.
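The managed Outcomes feature itself is not documented in detail here, but the loop it describes, generate, grade against a rubric in a fresh context, then retry on the flagged gaps, can be sketched roughly as follows. The rubric text, helper names, and model id are assumptions for illustration, not Anthropic's API.

```python
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # placeholder model id

RUBRIC = (
    "A successful report cites every source file, includes a risk section, "
    "and stays under 500 words."
)  # hypothetical rubric for illustration

def generate(task: str, feedback: str = "") -> str:
    prompt = task if not feedback else f"{task}\n\nGrader feedback to address:\n{feedback}"
    msg = client.messages.create(
        model=MODEL, max_tokens=1500,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def grade(output: str) -> str:
    # Fresh context: the grader sees only the rubric and the output, not the agent's reasoning.
    msg = client.messages.create(
        model=MODEL, max_tokens=500,
        system="Grade the output against the rubric. Reply PASS, or list the gaps.",
        messages=[{"role": "user", "content": f"Rubric:\n{RUBRIC}\n\nOutput:\n{output}"}],
    )
    return msg.content[0].text

def run(task: str, max_passes: int = 3) -> str:
    output = generate(task)
    for _ in range(max_passes):
        feedback = grade(output)
        if feedback.strip().upper().startswith("PASS"):
            break
        output = generate(task, feedback)  # agent takes another pass on the flagged gaps
    return output
```

Keeping the grader in its own context window is the design choice the announcement emphasizes: the evaluation cannot be steered by the agent's own chain of reasoning.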
Multi-agent orchestration allows a lead agent to decompose complex tasks and delegate subtasks to specialist agents, each with its own model, prompt, and tools. Specialists work in parallel on a shared filesystem with persistent events, so the lead agent can check progress mid-workflow. Every step is traceable in the Claude Console.
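On the hosted platform this runs on Anthropic's infrastructure with full tracing; purely to illustrate the shape of the pattern, a lead agent decomposing a task and specialists working in parallel on a shared filesystem, a local sketch might look like the following. The specialist roles, workspace path, and model ids are made up for the example, and it assumes the planner returns well-formed JSON.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"      # placeholder; the real platform allows a different model per agent
SHARED = Path("workspace")       # hypothetical shared filesystem the specialists write into
SHARED.mkdir(exist_ok=True)

# Each specialist gets its own system prompt (and, on the platform, its own model and tools).
SPECIALISTS = {
    "researcher": "Gather and summarize the relevant facts.",
    "writer": "Draft the deliverable from the research notes.",
}

def plan(task: str) -> list[dict]:
    """Lead agent decomposes the task into subtasks, one per specialist."""
    msg = client.messages.create(
        model=MODEL, max_tokens=800,
        system='Split the task into subtasks. Return only a JSON array of '
               '{"specialist": ..., "subtask": ...} objects.',
        messages=[{"role": "user", "content": task}],
    )
    return json.loads(msg.content[0].text)  # assumes the model returns valid JSON

def run_specialist(assignment: dict) -> Path:
    msg = client.messages.create(
        model=MODEL, max_tokens=1500,
        system=SPECIALISTS[assignment["specialist"]],
        messages=[{"role": "user", "content": assignment["subtask"]}],
    )
    out = SHARED / f"{assignment['specialist']}.md"  # lead agent can read this mid-workflow
    out.write_text(msg.content[0].text)
    return out

assignments = plan("Produce a migration guide for the billing service")
with ThreadPoolExecutor() as pool:  # specialists work in parallel
    results = list(pool.map(run_specialist, assignments))
```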
Both features moved from research preview to public beta with this announcement.
Early Adopter Results
Legal AI company Harvey reported a roughly 6x improvement in task completion rates after implementing dreaming, according to VentureBeat. Medical document review company Wisedocs cut review time by 50% using outcomes. Netflix is processing logs from hundreds of builds simultaneously using multi-agent orchestration.
The Lock-In Question
VentureBeat raised concerns that the combined update positions Claude Managed Agents as a direct competitor to the modular stacks most enterprises assemble: LangGraph or CrewAI for orchestration, Pinecone for memory, and DeepEval for evaluation. Rather than enterprises wiring those systems together themselves, memory, evals, state, and traceability are consolidated in one place, which means Anthropic sees every decision agents make.
The platform runs on Anthropic-owned infrastructure, which could create compliance challenges for organizations that need to prove data residency. Enterprises already in large-scale AI transformations face migration costs if they switch to Managed Agents, and switching costs if they later want to leave.
Anthropic has signaled that other model providers will likely adopt similar approaches, VentureBeat reported. The argument: models may become interchangeable, but the tooling and orchestration infrastructure that wraps them will not.
The Compute Context
The announcements arrive during a period of aggressive infrastructure expansion for Anthropic. CEO Dario Amodei disclosed at the conference that the company experienced 80x annualized growth in revenue and usage in Q1 2026, far exceeding its internal projection of 10x annual growth. The average Claude Code developer now spends 20 hours per week with the tool. Anthropic also doubled rate limits for Pro and Max subscribers in response to user frustration over compute constraints.
For teams early in agent deployment, Managed Agents now offers a single platform covering execution, memory, evaluation, and coordination. For teams already invested in modular tooling, the calculation is whether consolidation gains outweigh the cost of dependency on a single vendor’s roadmap.