Sam Altman told attendees at an OpenAI enterprise event on June 3 that AI token costs had gone from invisible to urgent in under six months. “At the beginning of 2026, the issue never came up. People were totally happy with the amount they were spending,” Altman said, according to Business Insider. “Now, AI costs are a huge issue.”
The numbers illustrate why. OpenAI’s top internal token spender now consumes 100 billion tokens per month, according to Altman. Six and a half years ago, the company’s top user burned through 100,000 tokens a month, a user Altman said was “very likely the token leader in the world” at the time. That 100,000 figure is now roughly the per capita global average, Business Insider reported.
Even OpenAI’s internal top spender is not the highest in the world. Altman acknowledged finding someone outside the company who spends more, calling it a personal “embarrassment.”
The Agent Cost Multiplier
The spending spike coincides with the proliferation of autonomous agents, which multiply API calls per user interaction. Long-running agent workflows that persist for hours or days consume orders of magnitude more tokens than single-prompt interactions.
The reference case: OpenClaw creator Peter Steinberger’s team spent $1.3 million on OpenAI API tokens in a single month, totaling 603 billion tokens across approximately 7.6 million requests, according to Business Insider and Tom’s Hardware. The New York Times reported that one OpenAI employee spent 210 billion tokens in a single week.
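The reported figures imply some rough unit economics. The sketch below only divides the three numbers cited above by one another; the function name is illustrative, and the results are blended averages that say nothing about the split between input and output tokens or about which models were used.

```python
# Figures as reported in the article; everything derived is simple division.
total_spend_usd = 1_300_000        # reported monthly API spend
total_tokens = 603_000_000_000     # reported monthly token count
total_requests = 7_600_000         # reported approximate request count


def blended_unit_costs(spend_usd, tokens, requests):
    """Return (USD per million tokens, tokens per request, USD per request)."""
    usd_per_million = spend_usd / (tokens / 1_000_000)
    tokens_per_request = tokens / requests
    usd_per_request = spend_usd / requests
    return usd_per_million, tokens_per_request, usd_per_request


usd_per_million, tokens_per_request, usd_per_request = blended_unit_costs(
    total_spend_usd, total_tokens, total_requests
)

print(f"Blended price: ${usd_per_million:.2f} per million tokens")
print(f"Average request size: {tokens_per_request:,.0f} tokens")
print(f"Average request cost: ${usd_per_request:.2f}")
```

On these numbers the blended price works out to about $2.16 per million tokens, with an average request of roughly 79,000 tokens costing about 17 cents. The scale comes from volume and request size rather than from a high per-token price.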
OpenAI maintains a token leaderboard internally, and employees sometimes post their totals on X, per Business Insider. The culture of conspicuous token spending has become a meme in startup circles.
Enterprise Pushback Is Already Here
Not every organization is treating token consumption as a flex. Amazon shut down its token leaderboard, and Uber set token caps after COO Andrew Macdonald said the spending was becoming harder to justify, according to Business Insider.
Altman referenced the meme directly at the event: “My company spent my entire 2026 budget in Q1, can you make this more efficient?” He said OpenAI was “continuing to push its models and explore other ways to deliver more value for less spend,” according to Business Insider.
The Agent Builder Constraint
For teams building on agentic architectures, inference cost is now the primary scaling constraint. A single agent workflow can consume thousands of tokens per step across planning, tool use, reflection, and error correction. Multi-agent systems compound this further, with orchestrators spawning sub-agents that each run their own inference loops.
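A back-of-the-envelope model makes the compounding concrete. The sketch below is illustrative only: the per-phase token counts, step counts, sub-agent count, and the $2.00 per million token price are all hypothetical assumptions, not figures from any provider or from the reporting above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepProfile:
    """Hypothetical tokens consumed by each phase of one agent step."""
    planning: int
    tool_use: int
    reflection: int
    error_correction: int

    def tokens_per_step(self) -> int:
        return (self.planning + self.tool_use
                + self.reflection + self.error_correction)


def workflow_tokens(profile: StepProfile, steps: int,
                    sub_agents: int = 0, sub_agent_steps: int = 0) -> int:
    """Total tokens for an orchestrator plus any sub-agents it spawns.

    Assumes every sub-agent runs its own loop with the same step profile.
    """
    own = profile.tokens_per_step() * steps
    spawned = sub_agents * profile.tokens_per_step() * sub_agent_steps
    return own + spawned


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    return tokens / 1_000_000 * usd_per_million_tokens


# All values below are assumptions chosen for illustration.
profile = StepProfile(planning=1500, tool_use=2000,
                      reflection=1000, error_correction=500)
ASSUMED_PRICE = 2.00  # hypothetical USD per million tokens

single = workflow_tokens(profile, steps=40)
multi = workflow_tokens(profile, steps=40, sub_agents=5, sub_agent_steps=30)

print(f"Single agent: {single:,} tokens, ${cost_usd(single, ASSUMED_PRICE):.2f}")
print(f"Multi-agent:  {multi:,} tokens, ${cost_usd(multi, ASSUMED_PRICE):.2f}")
```

Under these assumptions, one 40-step run uses 200,000 tokens, and adding five sub-agents at 30 steps each brings the total to 950,000. Small per-run costs become significant once a workflow runs thousands of times a day.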
The cost pressure creates a market opening for on-device inference, smaller specialized models, aggressive caching strategies, and agent frameworks that minimize unnecessary API calls. It also raises a strategic question: if even the foundation model providers acknowledge that costs have become a major issue at agent-scale usage, are the economics of the agent infrastructure layer due for a reset?
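One of the simpler caching strategies is exact-match response caching, which skips the API call when an identical request has already been answered. The sketch below is a minimal in-memory version; `CachedClient` and `fake_model` are hypothetical names, and the stand-in function takes the place of a real provider call. It only helps where repeated prompts are identical and a reused answer is acceptable.

```python
import hashlib
from typing import Callable, Dict


class CachedClient:
    """Wraps a model-call function with an exact-match in-memory cache."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash model and prompt together so different models never collide.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._cache:
            self.hits += 1          # served locally, no tokens billed
            return self._cache[key]
        self.misses += 1            # only misses reach the (paid) model call
        result = self._call_model(model, prompt)
        self._cache[key] = result
        return result


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real API call (an assumption for this sketch)."""
    return f"[{model}] echo: {prompt}"


client = CachedClient(fake_model)
for _ in range(3):
    client.complete("small-model", "Summarize the ticket")
client.complete("small-model", "Classify the ticket")

print(f"hits={client.hits} misses={client.misses}")
```

Here four requests result in two model calls. A production version would also need eviction, expiry, and a policy for prompts whose answers should not be reused.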