Anthropic released the advisor tool on the Claude Platform on April 9, making the “advisor strategy” a one-line API change for developers building agents. The approach pairs Claude Opus as a strategic advisor with Sonnet or Haiku running as the executor, keeping costs near Sonnet levels while pulling in Opus-level reasoning only when the executor gets stuck.
How It Works
The executor model (Sonnet or Haiku) runs the task end-to-end: calling tools, reading results, iterating toward a solution. When it hits a decision point beyond its capability, it escalates to Opus for guidance. Opus accesses the shared context, returns a plan, correction, or stop signal, and the executor resumes. The advisor never calls tools or produces user-facing output, according to Anthropic’s blog post.
This inverts the common sub-agent pattern, in which a large orchestrator decomposes work and delegates to smaller workers. In the advisor strategy, the smaller model drives the task and escalates when stuck; there is no decomposition, no worker pool, and no orchestration logic.
The entire handoff happens inside a single /v1/messages request. Developers declare advisor_20260301 in their tools array, set the advisor model to claude-opus-4-6, and optionally cap advisor calls with max_uses. Advisor tokens are billed at Opus rates, executor tokens at executor rates. Since the advisor typically generates 400 to 700 tokens of guidance per call, overall cost stays well below running Opus end-to-end.
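The request shape below is a sketch assembled from the fields the announcement names (`advisor_20260301`, `claude-opus-4-6`, `max_uses`); the executor model id, the tool's `name` field, and the overall schema are placeholders rather than confirmed API details.

```python
# Hypothetical /v1/messages request body for the advisor strategy.
# Only the field names advisor_20260301, claude-opus-4-6, and max_uses
# come from Anthropic's post; everything else is illustrative.
import json

EXECUTOR_MODEL = "claude-sonnet-4-5"  # placeholder executor id (Sonnet or Haiku)

payload = {
    "model": EXECUTOR_MODEL,
    "max_tokens": 4096,
    "tools": [
        {
            "type": "advisor_20260301",   # the one-line addition
            "name": "advisor",            # assumed, mirroring other server tools
            "model": "claude-opus-4-6",   # the advisor model
            "max_uses": 3,                # optional cap on advisor calls
        }
    ],
    "messages": [
        {"role": "user", "content": "Find and fix the failing test in this repo."}
    ],
}

print(json.dumps(payload, indent=2))
```

Everything else in the request stays the same as an ordinary agentic call; the escalation loop itself runs server-side within that single request.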
Benchmark Results
Anthropic reported that Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while cutting cost per agentic task by 11.9%.
The Haiku numbers are more dramatic. On BrowseComp, Haiku with an Opus advisor scored 41.2%, more than double its solo score of 19.7%. That combination trails Sonnet solo by 29% in score but costs 85% less per task, making it a viable option for high-volume workloads where cost per call matters more than peak accuracy.
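A quick arithmetic check on the reported BrowseComp figures confirms the "more than double" framing:

```python
# Reported BrowseComp scores from the announcement.
haiku_solo = 19.7          # Haiku alone
haiku_with_advisor = 41.2  # Haiku with an Opus advisor

ratio = haiku_with_advisor / haiku_solo
print(f"improvement factor: {ratio:.2f}x")  # ~2.09x, i.e. more than double
```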
The Cost Architecture Shift
The advisor tool works alongside existing tools in the same API request. An agent can search the web, execute code, and consult Opus in the same loop. Pricing is tiered: advisor tokens bill at the advisor model’s rate, everything else bills at the executor’s rate.
For teams running agentic workloads at scale, the math changes meaningfully. Instead of choosing between Opus quality and Sonnet cost, the advisor strategy offers a hybrid where frontier reasoning applies only at decision points and the rest of the run stays cheap. Developers set max_uses to control how often the executor can call the advisor, providing a hard cap on the premium-tier spend.
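To make that math concrete, here is a hedged cost sketch. All rates and token counts are hypothetical placeholders, not Anthropic's published pricing, and it simplifies billing to output tokens only; the structure mirrors the tiered scheme the article describes, with advisor tokens at the advisor model's rate and everything else at the executor's rate.

```python
# Hypothetical per-million-token rates (placeholders, not real pricing).
OPUS_RATE = 75.0    # $/M tokens for the advisor model
SONNET_RATE = 15.0  # $/M tokens for the executor model

def task_cost(executor_tokens, advisor_calls, advisor_tokens_per_call=550):
    """Blended cost of one agentic run: executor tokens bill at the
    executor rate, advisor guidance (~400-700 tokens/call) at the Opus rate."""
    executor = executor_tokens / 1e6 * SONNET_RATE
    advisor = advisor_calls * advisor_tokens_per_call / 1e6 * OPUS_RATE
    return executor + advisor

# A 50k-token run with two advisor consultations vs. the same run all on Opus:
hybrid = task_cost(50_000, advisor_calls=2)
all_opus = 50_000 / 1e6 * OPUS_RATE
print(f"hybrid: ${hybrid:.2f}  all-Opus: ${all_opus:.2f}")
```

Because each consultation adds only a few hundred Opus tokens, the advisor surcharge stays a small fraction of the run, and max_uses bounds it outright.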
Competitive Context
The advisor pattern sits between two existing approaches: running everything on Opus (expensive, high quality) and running everything on Sonnet (cheaper, lower ceiling). OpenAI’s Agents SDK and Google’s agent frameworks don’t currently offer an equivalent in-request model escalation mechanism. LangChain’s Deep Agents Deploy, released the same week, takes a different approach: model-agnostic deployment with no built-in tiered reasoning. Anthropic’s bet is that the quality-cost tradeoff problem is best solved inside the API, not inside the framework.