OpenAI’s ChatGPT and Codex services experienced a partial outage on April 20, with Downdetector reports climbing past 5,000 users within an hour of the first disruptions. OpenAI’s status page confirmed the company was “investigating” the issue, later updating to “monitoring the recovery.”

Timeline

The outage began around 10:40 AM ET (14:40 UTC), according to GVWire’s live tracking. By 11:01 AM ET, reports had reached nearly 4,000. By 11:14 AM ET, over 5,000 users reported they were unable to load ChatGPT or Codex. OpenAI’s status page acknowledged that “users are unable to load ChatGPT and Codex” and confirmed both services were affected.

TechRadar reported that affected users experienced blank pages, extended response times, and failures when starting new chat sessions. As of 12:03 PM ET (16:03 UTC), OpenAI was still investigating.

The Agent Dependency Problem

For individual ChatGPT users, an outage is an inconvenience. For the growing number of production systems that route agent workflows through OpenAI’s API, it is an operational failure. Codex, OpenAI’s coding agent product, was also affected, meaning automated development pipelines relying on Codex would have stalled.

This outage follows a pattern. GitHub reported last week that AI agents now drive 275 million commits per week, with 17 million agent-opened pull requests in March 2026 alone. Many of those agents depend on OpenAI models for reasoning and code generation. When the underlying inference layer goes down, every agent built on top of it goes down with it.

The architecture question for teams running agent systems is straightforward: how many of your autonomous workflows have a fallback provider? Multi-model routing, where agent orchestrators can switch between OpenAI, Anthropic, Google, or open-weight models during outages, remains the exception rather than the norm. Platforms like Cloudflare’s AI Gateway (announced last week with support for 70+ models across 12+ providers) are built precisely for this scenario, but adoption is still early.
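The fallback pattern itself is simple to sketch. The snippet below is a minimal, hypothetical illustration (not Cloudflare's or any vendor's actual API): providers are tried in priority order, and the orchestrator falls through to the next one when a call fails. The `Provider` type, `ProviderError`, and the simulated outage are all invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

class ProviderError(Exception):
    """Raised when a model provider fails to return a completion."""

@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # prompt -> completion text

def route_with_fallback(providers: list[Provider], prompt: str) -> tuple[str, str]:
    """Try each provider in priority order; return (provider_name, completion)
    from the first one that succeeds. Raise only if every provider fails."""
    errors = []
    for p in providers:
        try:
            return p.name, p.complete(prompt)
        except ProviderError as e:
            errors.append((p.name, str(e)))
    raise RuntimeError(f"all providers failed: {errors}")

# Simulate an outage at the primary provider.
def primary_down(prompt: str) -> str:
    raise ProviderError("503 service unavailable")

providers = [
    Provider("primary", primary_down),                    # e.g. the affected API
    Provider("fallback", lambda p: f"[fallback] {p}"),    # secondary model
]

name, reply = route_with_fallback(providers, "summarize the incident")
print(name)   # the fallback provider handled the request
```

In a real orchestrator, the per-provider calls would wrap actual SDK clients, and the failover logic would typically add timeouts, retry budgets, and health checks so a degraded primary does not add latency to every request before the fallback fires.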

Recovery

OpenAI updated its status page to indicate it was “monitoring the recovery” as of early afternoon ET. The company has not disclosed the root cause. For the teams whose agent pipelines were disrupted, the fix is the same one infrastructure engineers have recommended since the cloud era began: don’t build single points of failure into critical paths.