An internal OpenAI reasoning model autonomously disproved a conjecture that Paul Erdős posed in 1946 about the planar unit distance problem: given n points in the plane, what is the maximum number of pairs that can be exactly one unit apart? For 80 years, mathematicians believed square grid constructions were essentially optimal. The model found an infinite family of configurations that beat them by a polynomial factor.
The Problem and the Proof
Erdős conjectured that the maximum number of unit-distance pairs grows only slightly faster than linearly with the number of points. The best previously known construction, based on rescaled square grids, achieves growth of n^(1 + C/log log(n)) for a constant C > 0, where the exponent’s bonus term C/log log(n) shrinks toward zero as n grows.
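To make the grid idea concrete, here is a minimal Python sketch that counts pairs at a fixed distance in a small integer grid. It is an illustration only, not the construction from the proof; the function names (`grid`, `offsets`, `count_pairs`) and the choice of a 10 × 10 grid are assumptions made for this example. It works with squared distances so that all arithmetic stays in exact integers.

```python
from math import isqrt


def grid(k):
    """Return the k x k integer grid as a list of (x, y) points."""
    return [(x, y) for x in range(k) for y in range(k)]


def offsets(d2):
    """Return all integer vectors (dx, dy) with dx^2 + dy^2 == d2."""
    r = isqrt(d2)
    result = []
    for dx in range(-r, r + 1):
        rem = d2 - dx * dx
        dy = isqrt(rem)
        if dy * dy == rem:
            result.append((dx, dy))
            if dy != 0:
                result.append((dx, -dy))
    return result


def count_pairs(points, d2):
    """Count unordered pairs of points whose squared distance is d2."""
    pts = set(points)
    offs = offsets(d2)
    ordered = sum(
        1 for (x, y) in pts for (dx, dy) in offs if (x + dx, y + dy) in pts
    )
    return ordered // 2  # each unordered pair was seen from both endpoints


points = grid(10)
print(count_pairs(points, 1))   # 180 pairs at distance 1
print(count_pairs(points, 25))  # 268 pairs at distance 5
```

Dividing every coordinate by 5 turns the pairs at distance 5 into unit-distance pairs, so the same 100 points yield 268 unit distances instead of 180. The gain comes from 25 having more representations as a sum of two squares (12 integer vectors) than 1 does (4 vectors). The rescaled-grid construction relies on this effect at much larger scale; a 10 × 10 grid shows the mechanism but says nothing about the asymptotic rate.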
OpenAI’s model disproved this by constructing configurations with at least n^(1+δ) unit-distance pairs for a fixed δ > 0. Princeton mathematician Will Sawin refined the bound to δ = 0.014 in a forthcoming paper. The proof draws on algebraic number theory, specifically infinite class field towers and Golod-Shafarevich theory, applying them to a geometric question in a way that surprised specialists in both fields, according to OpenAI’s announcement.
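A short calculation shows why a fixed δ > 0 amounts to a polynomial improvement over the grid bound. The notation u(n) for the maximum number of unit-distance pairs among n points is introduced here for convenience, and C is the unspecified constant from the grid construction.

```latex
% Grid construction (lower bound):   u(n) \ge n^{1 + C/\log\log n}
% New construction (lower bound):    u(n) \ge n^{1 + \delta}, \quad \delta > 0 \text{ fixed}
\[
  \frac{n^{1+\delta}}{n^{1 + C/\log\log n}}
  \;=\; n^{\,\delta - C/\log\log n}.
\]
% Once \log\log n > 2C/\delta, i.e. n > \exp(\exp(2C/\delta)),
% we have C/\log\log n < \delta/2, and therefore
\[
  \frac{n^{1+\delta}}{n^{1 + C/\log\log n}} \;\ge\; n^{\delta/2}
  \;\longrightarrow\; \infty \quad (n \to \infty).
\]
```

Because the grid's bonus term tends to zero while δ stays fixed, the ratio eventually exceeds a fixed positive power of n. With δ = 0.014 the exponent gap is small, so the advantage would only become visible for very large n.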
External Validation
OpenAI published the full proof alongside a companion paper co-authored by mathematicians Noga Alon, Melanie Wood, and Thomas Bloom. Fields medalist Tim Gowers called it “a milestone in AI mathematics” in the companion paper, per TechCrunch.
Bloom, who maintains the Erdős problems website and previously called OpenAI’s 2025 Erdős claims “a dramatic misrepresentation,” validated this result. He noted the AI succeeded by “persevering down paths that a human may have dismissed as not worth their time to explore,” but added that “the human still plays a vital role in discussing, digesting and improving this proof,” according to The Guardian.
The context matters. Seven months ago, OpenAI’s then-VP Kevin Weil posted that GPT-5 had solved 10 Erdős problems. It hadn’t. The model had rediscovered solutions already in the literature, TechCrunch reported. This time, the proof is original and independently verified.
What Multi-Step Reasoning Means for Agent Autonomy
OpenAI characterized this as “the first time that a prominent open problem, central to a subfield of mathematics, has been solved autonomously by AI.” The model was not trained for mathematics, not scaffolded to search proof strategies, and not directed at this problem specifically. It produced the proof as part of a broader evaluation on Erdős problems.
The capability demonstrated here, holding together a long chain of reasoning across multiple mathematical subfields and synthesizing them into a novel argument, is the same capability that determines whether an autonomous agent can decompose a complex multi-step workflow without human intervention. An agent that can chain algebraic number theory into discrete geometry can also chain API calls, error diagnostics, and recovery strategies across a production pipeline. The reasoning depth that solves open mathematics is the reasoning depth that enables reliable agent autonomy.
OpenAI is preparing to float on the US stock market, according to The Guardian. Demonstrating frontier reasoning capability with external mathematical validation is a stronger IPO proof point than benchmark scores.