Stella Laurenzo, director of AMD’s AI group and former Google OpenXLA lead, filed a GitHub issue on Friday declaring Claude Code “unusable for complex engineering tasks.” The criticism is backed by quantitative data: 6,852 sessions, 234,760 tool calls, and 17,871 thinking blocks analyzed across months of use in what Laurenzo described as a “very consistent, high complexity work environment.”

The numbers are specific. According to The Register, stop-hook violations, which catch ownership dodging, premature cessation, and permission-seeking behavior, went from zero before March 8 to an average of 10 per day through the end of March. The average number of code reads before making edits dropped from 6.6 to 2. And Claude began rewriting entire files with increasing frequency rather than making targeted edits.
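Laurenzo's actual tooling isn't public, but the reads-before-edit metric she cites is straightforward to compute from session transcripts. A minimal sketch, assuming a hypothetical JSONL log format where each line records one tool call with a `tool` field of `"read"` or `"edit"` (this schema is an illustration, not Claude Code's real log format):

```python
import json

def reads_before_edits(lines):
    """Average number of file reads preceding each edit in a session log.

    `lines` is an iterable of JSON strings, one tool call per line, using
    a hypothetical schema: {"tool": "read"} or {"tool": "edit"}.
    """
    reads_since_last_edit = 0
    reads_per_edit = []
    for line in lines:
        call = json.loads(line)
        if call["tool"] == "read":
            reads_since_last_edit += 1
        elif call["tool"] == "edit":
            # Record how much context was gathered before this edit.
            reads_per_edit.append(reads_since_last_edit)
            reads_since_last_edit = 0
    return sum(reads_per_edit) / len(reads_per_edit) if reads_per_edit else 0.0

# Example: three reads before the first edit, one before the second.
session = [
    '{"tool": "read"}', '{"tool": "read"}', '{"tool": "read"}',
    '{"tool": "edit"}',
    '{"tool": "read"}',
    '{"tool": "edit"}',
]
print(reads_before_edits(session))  # → 2.0
```

Run over thousands of sessions, a drop in this average from 6.6 to 2 is exactly the kind of signal that shows the model editing with less context than before.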

The Thinking Depth Problem

Laurenzo’s analysis points to a root cause: the early March deployment of thinking content redaction in Claude Code version 2.1.69. This feature defaults to stripping thinking content from API responses, meaning users cannot see what reasoning Claude performs before acting.

“When thinking is shallow, the model defaults to the cheapest action available: edit without reading, stop without finishing, dodge responsibility for failures, take the simplest fix rather than the correct one,” the GitHub issue explains. One external analysis cited by TechRadar described a 67% drop in thinking depth following the update.

“Every senior engineer on my team has reported similar experiences/anecdotes,” Laurenzo wrote.

The Enterprise Signal

This is not a hobbyist complaint. Laurenzo runs the AI division at a $200B+ semiconductor company. When someone at that level takes the time to file a public GitHub issue backed by session-level data, the tool has failed badly enough to affect engineering operations.

The Reddit thread linked in the issue shows broader developer sentiment aligning with Laurenzo’s experience. Multiple users report the same pattern: Claude Code performing reliably for months, then degrading after recent updates.

Laurenzo’s proposed fixes are practical. She wants Anthropic to expose thinking token counts per request so users can “monitor whether their requests are getting the reasoning depth they need.” She also asked for a premium thinking tier for engineers running complex workflows. “The current subscription model doesn’t distinguish between users who need 200 thinking tokens per response and users who need 20,000,” she wrote.
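The telemetry Laurenzo is asking for does not exist yet, but its shape is simple: a per-request counter that flags responses whose reasoning depth falls below what the workflow needs. A hypothetical sketch (the `thinking_tokens` field and the threshold are assumptions for illustration, not real Anthropic API fields):

```python
from dataclasses import dataclass

@dataclass
class RequestUsage:
    """Hypothetical per-request usage record; not a real Anthropic API type."""
    request_id: str
    thinking_tokens: int

def flag_shallow_requests(usages, min_thinking_tokens=20_000):
    """Return IDs of requests whose thinking depth fell below the threshold.

    The 20,000-token default echoes the figure in Laurenzo's issue; in
    practice the threshold would be tuned per workflow.
    """
    return [u.request_id for u in usages
            if u.thinking_tokens < min_thinking_tokens]

usages = [
    RequestUsage("req-1", thinking_tokens=24_500),
    RequestUsage("req-2", thinking_tokens=180),  # shallow, cheapest-action territory
]
print(flag_shallow_requests(usages))  # → ['req-2']
```

Exposing a counter like this would let teams distinguish "the model chose a quick fix" from "the model was never given the budget to reason," which is precisely the ambiguity the redaction feature created.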

Anthropic’s Position

The regression comes at an awkward moment. Anthropic today launched Claude Managed Agents, a managed infrastructure service positioning Claude as the runtime for enterprise agent deployments. Selling managed agent infrastructure while the underlying coding agent faces public reliability complaints from a major chipmaker’s AI team creates a credibility tension that Anthropic will need to address.

The company has not issued a detailed response to Laurenzo’s GitHub issue. According to The Register, this adds to a recent string of incidents including unexplained token usage surges and the accidental exposure of Claude Code’s entire source code.