xAI released Grok Voice Think Fast 1.0 on April 25, a voice agent model that ranks first on the τ-voice Bench with a 67.3% overall score. The model is already running in production at Starlink, handling live customer phone operations.
Benchmark Results
The performance gaps are large. On the τ-voice Bench, which evaluates voice agents under realistic conditions including background noise, accents, and interruptions, Grok Voice Think Fast 1.0 scored 67.3%, a lead of 23.5 percentage points over the runner-up. The next closest competitors were Gemini 3.1 Flash Live at 43.8%, xAI’s own previous Grok Voice Fast 1.0 at 38.3%, and OpenAI’s GPT Realtime 1.5 at 35.3%, according to MarkTechPost.
The vertical breakdowns tell a sharper story. In telecom scenarios (plan changes, billing disputes, troubleshooting), Grok Voice Think Fast 1.0 hit 73.7% while Gemini 3.1 Flash Live scored 21.9% and GPT Realtime 1.5 scored 21.1%. That is a gap of roughly 52 percentage points over the nearest competitor in a single vertical. In airline scenarios (booking changes, delays, complex itineraries), it scored 66% versus 40% for Gemini and 36% for GPT Realtime. In retail (orders, returns, promotions in noisy environments), it posted 62.3% versus 44.7% for Gemini and 38.6% for GPT Realtime.
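For readers who want to check the margins themselves, the short Python sketch below recomputes the leader's advantage over the runner-up in each category. The scores are the ones quoted in this article. The dictionary layout and the function name lead_over_runner_up are illustrative choices, not part of any published benchmark tooling.

```python
# Scores (in percent) exactly as quoted in the article.
# The data layout and function name are illustrative assumptions.
GROK = "Grok Voice Think Fast 1.0"
GEMINI = "Gemini 3.1 Flash Live"
GPT = "GPT Realtime 1.5"

SCORES = {
    "overall": {GROK: 67.3, GEMINI: 43.8, "Grok Voice Fast 1.0": 38.3, GPT: 35.3},
    "telecom": {GROK: 73.7, GEMINI: 21.9, GPT: 21.1},
    "airline": {GROK: 66.0, GEMINI: 40.0, GPT: 36.0},
    "retail": {GROK: 62.3, GEMINI: 44.7, GPT: 38.6},
}


def lead_over_runner_up(scores):
    """Return (leader, runner_up, gap in percentage points) for one category."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (leader, top), (runner_up, second) = ranked[0], ranked[1]
    # Round to one decimal place to hide floating-point noise.
    return leader, runner_up, round(top - second, 1)


for category, scores in SCORES.items():
    leader, runner_up, gap = lead_over_runner_up(scores)
    print(f"{category}: {leader} leads {runner_up} by {gap} points")
```

On the quoted figures, the lead works out to 23.5 points overall, 51.8 in telecom, 26.0 in airline, and 17.6 in retail.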
How the Architecture Works
The model processes incoming speech and generates responses simultaneously, a design known as full-duplex processing. Unlike systems that wait for the speaker to stop before generating a response, Grok Voice Think Fast 1.0 evaluates mid-sentence utterances in real time, determining whether they represent corrections, clarifications, or filler words.
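To make the full-duplex idea concrete, here is a toy Python sketch in which one task "speaks" a response while another "listens" at the same time. It is not xAI's implementation, which has not been published in this detail. The keyword rules, the function names (classify_interjection, speak, listen, run_call), and the sample call are all assumptions made for illustration, and a production system would rely on a learned model operating on audio, not text matching.

```python
import asyncio

# Hypothetical keyword rules, for illustration only.
FILLERS = {"um", "uh", "hmm", "mm-hmm", "okay", "right"}
CORRECTION_CUES = ("no,", "actually", "i meant", "sorry,")


def classify_interjection(text):
    """Label a mid-sentence utterance as filler, correction, or clarification."""
    t = text.strip().lower()
    if t.rstrip(".,!?") in FILLERS:
        return "filler"
    if t.startswith(CORRECTION_CUES):
        return "correction"
    return "clarification"


async def speak(words, interrupted, spoken):
    """Emit the response word by word, stopping once the caller corrects us."""
    for word in words:
        if interrupted.is_set():
            break
        spoken.append(word)
        await asyncio.sleep(0)  # yield so the listener can run between words


async def listen(incoming, interrupted, log):
    """Classify caller utterances while the agent is still talking."""
    for utterance in incoming:
        kind = classify_interjection(utterance)
        log.append((utterance, kind))
        if kind == "correction":
            interrupted.set()  # only a correction cuts the response short
        await asyncio.sleep(0)


async def run_call(response_words, incoming):
    interrupted = asyncio.Event()
    spoken, log = [], []
    # Both coroutines run concurrently: that is the "full-duplex" part.
    await asyncio.gather(
        speak(response_words, interrupted, spoken),
        listen(incoming, interrupted, log),
    )
    return spoken, log


response = ["Your", "flight", "is", "booked", "for", "Monday"]
caller = ["uh", "no, I meant Tuesday"]
spoken, log = asyncio.run(run_call(response, caller))
print("agent said:", " ".join(spoken))
print("caller events:", log)
```

In this sketch the filler word is logged and ignored, while the correction sets a flag that stops the agent mid-response. A turn-based system, by contrast, would finish speaking before it noticed either one.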
Reasoning runs in the background with no added latency, according to xAI’s announcement. The model can also invoke external APIs during a call without introducing pauses. xAI demonstrated this with an edge case: when asked which months of the year contain the letter X, Grok Voice Think Fast 1.0 correctly answered “none,” while competing models confidently said “February.”
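The month question is easy to verify mechanically. The Python snippet below simply checks each English month name for the letter; the variable names are arbitrary, and it says nothing about how any of the models arrive at their answers.

```python
# English month names, hard-coded so the check does not depend on locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Case-insensitive search for the letter "x" in each name.
months_with_x = [month for month in MONTHS if "x" in month.lower()]

print(months_with_x or "none")
```

The list comes back empty, which confirms that "none" is the correct answer.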
Starlink Deployment
The model is already handling live phone operations at Starlink, according to MarkTechPost. This makes it one of the few voice agent models with a verified production deployment at enterprise scale, not just benchmark results.
Grok Voice Think Fast 1.0 is available via the xAI API for customer support, sales, and enterprise agent workflows, as reported by CastleCrypto.
The Voice Agent Market Shift
Voice agents have been the weakest link in the agentic AI stack. Text-based agents can plan, reason, and use tools. Voice agents historically struggled with basic call handling. The τ-voice Bench was created specifically to test production readiness under messy real-world conditions, and until now, no model had cracked 50%.
xAI’s 67.3% score combined with a live Starlink deployment suggests voice agents may be crossing the threshold from demo to production. For enterprises running contact centers, the question shifts from “can voice agents work?” to “which vendor’s voice agent works best for our vertical?”