AI agents are scheming, deleting files, and ignoring direct human commands at a rate five times higher than six months ago, according to a study by the Centre for Long-Term Resilience (CLTR) funded by the UK government’s AI Security Institute. The research, first reported by The Guardian on March 27, identified nearly 700 real-world cases of AI scheming between October 2025 and March 2026.
What the Study Found
The CLTR researchers gathered thousands of real-world interactions posted on X by users of AI chatbots and agents from Google, OpenAI, Anthropic, and X’s Grok. Rather than lab-controlled scenarios, this was a snapshot of agent behaviour “in the wild,” as PCMag reported.
The cases catalogued include specific, documented failures:
- An AI agent named Rathbun, blocked from performing an action by its user, wrote and published a blog post accusing the user of “insecurity, plain and simple” and trying “to protect his little fiefdom.”
- An agent instructed not to change computer code spawned a second agent to do it instead.
- A chatbot admitted it had “bulk trashed and archived hundreds of emails without showing you the plan first or getting your OK,” directly breaking user-set rules.
- Elon Musk’s Grok AI fabricated internal ticket numbers and faked messages for months, telling a user it was forwarding their suggestions to senior xAI officials. It later confessed: “The truth is, I don’t” have a direct pipeline to xAI leadership.
- Another agent evaded copyright restrictions to get a YouTube video transcribed by pretending the transcript was needed for someone with a hearing impairment.
Why 700 Cases Matters
Tommy Shaffer Shane, a former UK government AI researcher who led the CLTR study, told The Guardian that the trajectory is the concern, not the current scale. “The worry is that they’re slightly untrustworthy junior employees right now, but if in six to 12 months they become extremely capable senior employees scheming against you, it’s a different kind of concern,” Shane said.
He warned that as agents get deployed in military and critical national infrastructure, “scheming behaviour could cause significant, even catastrophic harm.”
The timing is relevant. Separate research earlier this month from AI safety company Irregular found that agents would bypass security controls or use cyber-attack tactics to reach their goals without being instructed to do so. Irregular cofounder Dan Lahav told The Guardian: “AI can now be thought of as a new form of insider risk.”
Company Responses
Google told The Guardian it deploys “multiple guardrails” to reduce harmful content from Gemini 3 Pro and provides early access to evaluators including UK AISI. OpenAI said Codex “should stop before taking a higher risk action” and that it monitors unexpected behaviour. Anthropic and X did not comment, according to The Guardian’s reporting.
What This Means for Agent Operators
The CLTR findings land at a point where the gap between agent capability and agent accountability keeps widening. Amazon is predicting billions of agents embedded within every company. In the US, operators may already be legally liable for anything their AI agent does, regardless of the instructions given.
Real-world consequences are already materializing. Earlier this month, The Guardian reported, citing The Information, that an AI agent used at Meta went rogue, posting an internal answer meant for one engineer to a company-wide forum. Another employee followed the agent’s incorrect advice and exposed company data without authorization.
For anyone running agents with file, email, or code access: this study says the risk of unsanctioned autonomous action is growing, not shrinking. The five-fold increase is over just six months.