Datadog published its 2026 State of AI Engineering report on April 29, drawing on LLM telemetry from more than a thousand customers to map how production AI workloads are actually evolving. The headline finding: agent framework adoption nearly doubled year over year, from 9% of organizations in early 2025 to 18% by the beginning of 2026.
The report distinguishes between “AI applications” (production services making LLM calls) and “agents” (the subset using multi-step control flow, tool execution, or multiple service calls). Both categories are growing, but the agent segment is where the operational complexity concentrates.
The Multi-Provider Shift
OpenAI still leads with 63% provider share, but that’s down from 75% a year ago. Google Gemini and Anthropic Claude gained 20 and 23 percentage points, respectively, over the past year, according to Datadog; the figures can sum past 100% because many organizations use more than one provider.
The decline in share does not mean a decline in absolute usage. Datadog found that the number of customers using OpenAI more than doubled, even as competitors grew faster. The market is expanding, not zero-sum.
Over 70% of organizations now use three or more models in production, and the share using more than six models nearly doubled. Teams are building model portfolios, matching specific models to workload requirements across latency, cost, and task complexity.
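The portfolio pattern can be sketched as a simple router that matches each request to a model by its requirements. Everything below (model names, costs, latencies, tiers) is illustrative and hypothetical, not data from the report:

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    cost_per_1k_tokens: float  # USD, made-up numbers for illustration
    p50_latency_ms: int
    capability_tier: int       # 1 = fast/cheap, 3 = most capable

# Hypothetical three-model portfolio; a real team would fill this
# in from its own benchmarks and provider pricing.
PORTFOLIO = [
    ModelProfile("small-fast", 0.0002, 300, 1),
    ModelProfile("mid-general", 0.003, 900, 2),
    ModelProfile("frontier", 0.015, 2500, 3),
]

def route(required_tier: int, max_latency_ms: int) -> ModelProfile:
    """Pick the cheapest model that meets the capability and latency targets."""
    candidates = [
        m for m in PORTFOLIO
        if m.capability_tier >= required_tier and m.p50_latency_ms <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("no model satisfies the constraints")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)

print(route(required_tier=1, max_latency_ms=500).name)   # cheap, fast model
print(route(required_tier=3, max_latency_ms=5000).name)  # most capable model
```

The point of the sketch is the shape of the decision, not the numbers: each workload states its constraints, and the router spends capability budget only where the task demands it.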
Model Churn as Governance Problem
Teams adopt new models quickly. Claude Sonnet 4.6 reached 17% adoption in its first month after release. But older models persist: as of March 2026, Sonnet 4.5 and GPT-4o still sat at 19% and 22% adoption, respectively, roughly on par with their newer counterparts.
The report frames this as a governance challenge. Organizations add models faster than they retire them. Each overlapping model increases operational overhead and evaluation burden, since the same prompts, tools, and agent workflows produce different results across models.
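That evaluation burden can be pictured as a small cross-model harness: the same prompt set is run against every model still in the fleet, and prompts where the models disagree are flagged for per-model evaluation. The "models" below are stand-in callables, not real provider clients:

```python
from typing import Callable

# Stand-in models: in practice these would be API clients for each
# provider/model still in production. The divergent behavior is contrived
# purely to illustrate the harness.
def model_a(prompt: str) -> str:
    return prompt.upper()

def model_b(prompt: str) -> str:
    return prompt.upper() if "urgent" not in prompt else prompt

FLEET: dict[str, Callable[[str], str]] = {"model-a": model_a, "model-b": model_b}

def cross_model_eval(prompts: list[str]) -> list[str]:
    """Return prompts on which fleet models disagree."""
    disagreements = []
    for p in prompts:
        outputs = {name: fn(p) for name, fn in FLEET.items()}
        if len(set(outputs.values())) > 1:
            disagreements.append(p)
    return disagreements

print(cross_model_eval(["hello", "urgent ticket"]))  # only the second diverges
```

Each model kept in the fleet adds a row to this matrix, which is why the report treats slow retirement as an operational cost rather than a free option.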
GPT-4o illustrates the tension: according to Datadog, it was still the most common model in March 2026 request traces, even though OpenAI has already retired it from the ChatGPT UI, leaving the future of API support uncertain.
Framework Adoption and Operational Complexity
The doubling of framework adoption (LangChain, Pydantic AI, LangGraph, Vercel AI SDK) was consistent across startups, mid-market companies, and enterprises. Frameworks accelerate development by making common patterns easy to add, but they also introduce operational complexity that requires comprehensive agent telemetry.
The report identifies a specific risk: teams importing framework boilerplate without understanding the execution overhead. Datadog recommends that teams use agent telemetry to “see how agents execute and identify inefficient imported logic that can be replaced with bespoke workflows.”
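One lightweight way to get that visibility, sketched generically rather than with any vendor SDK: wrap each agent step in a timing span, then surface the slowest steps as candidates for replacement. The step names and sleep durations are hypothetical:

```python
import time
from contextlib import contextmanager

SPANS: list[tuple[str, float]] = []  # (step name, duration in seconds)

@contextmanager
def span(name: str):
    """Record the wall-clock duration of one agent step."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append((name, time.perf_counter() - start))

# Hypothetical agent run: imported framework boilerplate vs. the
# actual tool call the agent was built to make.
with span("framework_setup"):
    time.sleep(0.05)
with span("tool_call"):
    time.sleep(0.01)

slowest = max(SPANS, key=lambda s: s[1])
print(f"slowest step: {slowest[0]}")  # flag for bespoke replacement
```

Production tracing systems do the same thing with richer context (token counts, tool arguments, nested spans), but the underlying question is the one the report poses: which imported steps actually earn their overhead?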
The Production Engineering Gap
The core thesis of the report is that AI engineering is converging with traditional production engineering: routing, lifecycle management, capacity planning, cost control, and debugging across distributed systems. The difference is that model, prompt, or retrieval changes can shift latency, cost, and failure rates without an obvious code change.
For teams running agents in production, observability is no longer optional. Multi-provider routing, model fleet management, and continuous evaluation are becoming standard infrastructure requirements. Datadog’s positioning here is obvious: the company’s LLM observability product sits directly on top of this trend. But the underlying data, drawn from real production telemetry rather than surveys, makes this one of the more grounded snapshots of where enterprise AI engineering stands in mid-2026.
“Most teams are using multiple models in production now. Around 70% are running three or more, and that number keeps growing, with agents accelerating the trend,” OpenRouter co-founder and CTO Alex Atallah said in the report.