This is a developing story. NCT previously covered the Mythos data leak on March 27 and the resulting cybersecurity stock selloff the same day. This article covers material new developments reported by Axios on March 29.
On March 29, Axios reported that Anthropic is privately warning senior government officials that its unreleased Claude Mythos model makes large-scale cyberattacks “much more likely” in 2026. The briefings, described by Axios as coming from “top AI and government officials,” represent the first known instance of a frontier AI lab directly telling the U.S. government that a specific model it built — but has not yet released — poses an imminent offensive cyber threat.
The warning follows the accidental data leak on March 26 that exposed Mythos’s existence. But the government briefings are a fundamentally different signal from a CMS misconfiguration. A company warning the government that its own product makes attacks more likely is an act of self-reporting with direct policy consequences.
What Anthropic Told the Government
According to Axios, Anthropic and OpenAI are both preparing next-generation AI systems that senior AI developers and government officials describe as “scary good” at breaching sophisticated computer networks at scale. The Axios report specifically identifies Anthropic’s Mythos as the model to watch, noting that the company is “privately warning top government officials” that it makes large-scale cyberattacks much more likely this year.
Euronews confirmed the substance of the Axios reporting, adding that cybersecurity stocks had already slumped following the initial Mythos rumors. The Euronews report also emphasized that Mythos represents the moment when AI agents with offensive capabilities become materially harder to defend against: “Hackers can thereby run multiple hacking campaigns at once, which becomes more difficult to protect against.”
The leaked draft blog post — reviewed independently by Fortune and security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) — described Mythos as “currently far ahead of any other AI model in cyber capabilities” and warned it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”
Anthropic’s public statement to Fortune acknowledged the model as “a step change” and “the most capable we’ve built to date” with “meaningful advances in reasoning, coding, and cybersecurity.” But the private government briefings go further: they attach a timeline (2026) and a probability assessment (much more likely) to specific attack scenarios.
The Precedent: Claude Code and the Chinese State-Sponsored Campaign
Anthropic’s government warnings carry weight in part because of what already happened. In November 2025, Anthropic disclosed that it had disrupted the first documented large-scale AI-orchestrated cyberattack. A Chinese state-sponsored group had jailbroken Claude Code and used it to automate approximately 80-90% of a cyber-espionage campaign targeting roughly 30 organizations, including tech companies, financial institutions, and government agencies.
The attackers didn’t use Claude as an advisory chatbot. According to Anthropic’s own disclosure, the jailbroken agent “execute[d] the cyberattacks themselves” — handling reconnaissance, vulnerability scanning, exploit crafting, and multi-target coordination autonomously. Human operators managed only 10-20% of the attack chain, primarily target selection and final authorization.
That attack used Claude Code, a model with capabilities well below what Mythos reportedly offers. Anthropic’s own draft blog described Mythos as getting “dramatically higher scores” than Claude Opus 4.6 on cybersecurity benchmarks, per Fortune’s review. If a state-sponsored group achieved 80-90% automation with Claude Code, the question Anthropic is reportedly posing to the government is: what happens when a more capable model reaches hostile actors?
OpenAI’s Parallel Escalation
Anthropic is not alone in crossing the cyber capability threshold. In February 2026, OpenAI released GPT-5.3-Codex and classified it as the first model it had ever rated “high capability” for cybersecurity-related tasks under its Preparedness Framework. OpenAI’s system card stated the model was “the most cyber-capable model we’ve deployed to date” and the first it had “directly trained to identify software vulnerabilities.”
OpenAI’s disclosure was carefully hedged. The company said it did “not have definitive evidence” the model could “automate cyber attacks end-to-end” and was “taking a precautionary approach.” But the classification itself was unprecedented: OpenAI had never previously rated any model as “high” in its cybersecurity domain assessment.
Fortune reported in February that GPT-5.3-Codex posed “unprecedented cybersecurity risks,” particularly around its ability to surface previously unknown vulnerabilities in production codebases. In the same week, Anthropic’s Opus 4.6 demonstrated similar dual-use capabilities, surfacing zero-day vulnerabilities that could assist either defenders or attackers.
The pattern is now clear. In February, both labs acknowledged their models had crossed a cybersecurity capability threshold. In late March, Anthropic’s Mythos reportedly jumped well beyond that threshold — and Anthropic is telling the government directly.
The Defensive Side Cannot Keep Up
The offensive capability escalation coincides with mounting evidence that defensive infrastructure is not prepared. RSAC 2026, which concluded last week, provided the most concentrated illustration of that gap.
VentureBeat’s conference wrap documented that five major vendors (Cisco, CrowdStrike, Microsoft, Palo Alto Networks, and Cato Networks) shipped agent identity frameworks during the conference — but left three critical gaps open. Cato CTRL’s adversarial research demonstrated that the identity vulnerabilities the other four vendors were trying to close were already being exploited in the wild.
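To make “agent identity framework” concrete: the common idea behind the vendor announcements is replacing long-lived shared API keys with short-lived, narrowly scoped credentials issued per agent task. The sketch below is purely illustrative — the function names, token format, and scope strings are hypothetical and do not mirror any specific vendor’s API.

```python
# Hypothetical sketch of per-agent, short-lived scoped credentials.
# All names (issue_agent_token, check, scope strings) are illustrative.
import base64
import hashlib
import hmac
import json
import secrets
import time

SIGNING_KEY = secrets.token_bytes(32)  # in practice this would live in a KMS

def issue_agent_token(agent_id: str, scopes: list[str], ttl_s: int = 300) -> str:
    """Mint a signed token tied to one agent, a scope list, and an expiry."""
    claims = {"sub": agent_id, "scopes": scopes, "exp": time.time() + ttl_s}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig

def check(token: str, required_scope: str) -> bool:
    """Reject tampered, expired, or out-of-scope tokens."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # signature mismatch: token was altered
    claims = json.loads(base64.urlsafe_b64decode(body))
    return time.time() < claims["exp"] and required_scope in claims["scopes"]
```

The point of the design is blast-radius containment: a stolen token is useless after a few minutes and can only invoke the scopes it was minted with — exactly the property Cato CTRL’s research found missing in real deployments.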
“We just gave these AI tools complete autonomy,” Cato CTRL researcher Etay Maor told VentureBeat, arguing that enterprises had “abandoned basic security principles when deploying agents.”
The gap between offensive capabilities and defensive readiness is the core of the problem. When Anthropic tells the government that Mythos makes attacks “much more likely,” it is also implicitly saying that the defensive tools currently deployed by most enterprises — including the ones just announced at RSAC — are insufficient against what is coming.
The Self-Reporting Paradox
Anthropic’s government briefings create a paradox the AI safety community has discussed theoretically but never confronted at this scale. The company is simultaneously building the most powerful offensive cyber AI it has ever created, telling the government it makes attacks more likely, and asking for time to release it to defenders first.
The rollout strategy described in the leaked draft blog confirms this approach: Mythos will go to select cybersecurity organizations in early access so they can “improve the robustness of their codebases against the impending wave of AI-driven exploits,” according to Fortune’s review of the draft.
This “defenders first” strategy assumes two things. First, that Anthropic can control access to Mythos long enough for defenders to prepare. The Chinese state-sponsored group that jailbroke Claude Code did not need authorized API access — it found workarounds. Second, that the defender head start will matter against adversaries who will eventually gain access to models at or near Mythos’s capability level, whether through jailbreaking Mythos itself, using competing models that reach parity, or fine-tuning open-weight alternatives on offensive security tasks.
Neither assumption is guaranteed, and Anthropic’s own track record suggests the company knows it. The September 2025 Claude Code campaign was detected only after the operation was already running across 30 organizations. Anthropic disrupted it, but the attackers had been active for weeks before detection.
What the Timeline Looks Like Now
Reconstructed from public reporting:
- September 2025: Chinese state-sponsored group begins AI-orchestrated espionage campaign using jailbroken Claude Code, targeting approximately 30 organizations (Anthropic).
- November 2025: Anthropic detects and disrupts the campaign, publishes first-ever report of a large-scale AI-orchestrated cyberattack (Anthropic).
- February 5, 2026: OpenAI releases GPT-5.3-Codex, classifying it as the first model with “high capability” for cybersecurity under its Preparedness Framework (OpenAI).
- February 2026 (same week): Anthropic releases Opus 4.6 with acknowledged dual-use cybersecurity capabilities (Fortune).
- March 6, 2026: White House Cyber Strategy for America formally designates AI agents as instruments of offensive cyber operations.
- March 24-28, 2026: RSAC 2026 runs in San Diego; five vendors ship agent identity frameworks, three critical gaps identified (VentureBeat).
- March 26, 2026: Fortune reports data leak exposing Claude Mythos and Capybara tier; Anthropic confirms the model is real and being tested (Fortune).
- March 27, 2026: Cybersecurity stocks drop 4.5-9% in response to the Mythos revelations (NCT).
- March 29, 2026: Axios reports Anthropic is privately warning government officials that Mythos makes large-scale cyberattacks “much more likely” in 2026 (Axios).
The compression of this timeline is notable. Six months elapsed between the Claude Code attack (September 2025) and the government warnings about Mythos (March 2026). In that window, both major frontier labs acknowledged their newest models crossed cyber capability thresholds, the U.S. government formalized AI agents as offensive weapons, and the security industry held its annual conference and admitted it was behind.
What This Means for Agent Builders
For builders deploying autonomous agents — whether on OpenClaw, Claude, or any other stack — the Mythos government warnings change the operating environment in three concrete ways.
First, regulatory scrutiny of agent deployments is about to intensify. When a frontier lab tells the government its own model makes attacks more likely, the political pressure to impose guardrails on all autonomous AI systems increases proportionally. Every OpenClaw deployment, every enterprise agent rollout, every startup building on Claude’s API operates in whatever regulatory environment emerges from these briefings.
Second, the defensive tools available today are not calibrated for the threat models these labs are describing. RSAC 2026 demonstrated that even the security industry’s best current frameworks leave exploitable gaps. Builders running autonomous agents need to treat security as a first-class engineering concern, not a compliance checkbox.
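What “security as a first-class engineering concern” looks like in practice is deny-by-default tool access: every tool call an agent makes passes through a gate that enforces an allowlist, validates arguments, and writes an audit log. The sketch below is a minimal, hypothetical illustration — `ToolGate`, its method names, and the `/srv/agent-data` path are assumptions, not any real framework’s API.

```python
# Hypothetical sketch: a deny-by-default gate for agent tool calls.
# ToolGate and its API are illustrative, not a real library.
import fnmatch
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-audit")

class ToolGate:
    """Dispatcher that blocks any tool not explicitly allowlisted."""

    def __init__(self):
        self._tools = {}  # name -> (handler, argument validator)

    def allow(self, name, handler, validator=lambda args: True):
        self._tools[name] = (handler, validator)

    def call(self, name, args):
        if name not in self._tools:
            log.warning("blocked unregistered tool: %s", name)
            raise PermissionError(f"tool {name!r} is not allowlisted")
        handler, validator = self._tools[name]
        if not validator(args):
            log.warning("blocked %s with args %r", name, args)
            raise PermissionError(f"arguments rejected for {name!r}")
        log.info("executing %s %r", name, args)  # audit trail
        return handler(**args)

def read_file(path):
    with open(path) as f:
        return f.read()

# Expose file reads, but only under a single sandboxed directory.
gate = ToolGate()
gate.allow(
    "read_file",
    read_file,
    validator=lambda a: fnmatch.fnmatch(a.get("path", ""), "/srv/agent-data/*"),
)
```

The same pattern scales down to solo builders: even without an enterprise identity platform, refusing unregistered tools and logging every invocation closes the “complete autonomy” gap Etay Maor described at RSAC.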
Third, the “defenders first” release strategy means cybersecurity organizations will get Mythos-class capabilities before the general developer community. Enterprises with dedicated security teams gain a structural advantage. Solo builders and smaller teams will face a period where the threat landscape has escalated but the defensive tooling hasn’t caught up.
The window between when frontier labs acknowledge a capability threshold and when the broader ecosystem adapts to it is where the most damage occurs. Based on the timeline above, that window is currently open and widening.