xAI's Grok Voice Think Fast 1.0 Tops Voice Agent Benchmarks, Already Running Starlink Phone Operations

xAI released Grok Voice Think Fast 1.0 on April 25, a voice agent model that ranks first on the τ-voice Bench with a 67.3% overall score. The model is already running in production at Starlink, handling live customer phone operations.

Benchmark Results

The gaps are wide. On the τ-voice Bench, which evaluates voice agents under realistic conditions including background noise, accents, and interruptions, Grok Voice Think Fast 1.0 scored 67.3%, 23.5 percentage points ahead of the runner-up. The next closest competitors were Gemini 3.1 Flash Live at 43.8%, xAI’s own previous Grok Voice Fast 1.0 at 38.3%, and OpenAI’s GPT Realtime 1.5 at 35.3%, according to MarkTechPost.

The vertical breakdowns tell a sharper story. In telecom scenarios (plan changes, billing disputes, troubleshooting), Grok Voice Think Fast 1.0 hit 73.7% while Gemini 3.1 Flash Live scored 21.9% and GPT Realtime 1.5 scored 21.1%. On those figures, that is a gap of nearly 52 percentage points over the runner-up, the widest of the three verticals. In airline scenarios (booking changes, delays, complex itineraries), it scored 66% versus 40% for Gemini and 36% for GPT Realtime. In retail (orders, returns, promotions in noisy environments), it posted 62.3% versus 44.7% for Gemini and 38.6% for GPT Realtime.
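For readers who want to check the arithmetic, the short Python sketch below recomputes the leader's margin over the best competitor in each category. It takes the scores quoted above at face value; the dictionary layout and the function name `lead_over_runner_up` are illustrative choices, not part of any benchmark tooling.

```python
# Scores (percent) as quoted in this article; not independently verified.
LEADER = "Grok Voice Think Fast 1.0"

SCORES = {
    "overall": {LEADER: 67.3, "Gemini 3.1 Flash Live": 43.8,
                "Grok Voice Fast 1.0": 38.3, "GPT Realtime 1.5": 35.3},
    "telecom": {LEADER: 73.7, "Gemini 3.1 Flash Live": 21.9,
                "GPT Realtime 1.5": 21.1},
    "airline": {LEADER: 66.0, "Gemini 3.1 Flash Live": 40.0,
                "GPT Realtime 1.5": 36.0},
    "retail": {LEADER: 62.3, "Gemini 3.1 Flash Live": 44.7,
               "GPT Realtime 1.5": 38.6},
}


def lead_over_runner_up(scores, leader=LEADER):
    """Return the leader's margin over the best other model, in percentage points."""
    best_other = max(value for name, value in scores.items() if name != leader)
    return round(scores[leader] - best_other, 1)


# Margin per category, rounded to one decimal place.
gaps = {category: lead_over_runner_up(s) for category, s in SCORES.items()}

for category, gap in gaps.items():
    print(f"{category}: {gap} percentage points")
```

Note that these are percentage-point differences between scores, not relative improvements.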

How the Architecture Works

The model processes incoming speech and generates responses simultaneously, a design known as full-duplex processing. Unlike systems that wait for the speaker to stop before generating a response, Grok Voice Think Fast 1.0 evaluates mid-sentence utterances in real time, determining whether they represent corrections, clarifications, or filler words.
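xAI has not published implementation details, so the following is only a toy illustration of the general idea: one task keeps receiving speech fragments while another reacts to them as they arrive. The keyword lists, the function `classify_interjection`, and the fallback to "clarification" are all hypothetical simplifications; a real model would make this judgment from audio and context, not string matching.

```python
import asyncio

# Hypothetical cue lists for illustration only.
FILLERS = {"um", "uh", "hmm", "like", "you know"}
CORRECTION_CUES = ("no,", "actually", "i meant", "sorry,", "wait")


def classify_interjection(text: str) -> str:
    """Crude keyword heuristic: label a mid-sentence fragment."""
    t = text.strip().lower()
    if t.rstrip(".,!?") in FILLERS:
        return "filler"
    if t.startswith(CORRECTION_CUES):
        return "correction"
    # Anything else is treated as a clarification in this toy version.
    return "clarification"


async def listen(fragments, inbox):
    """Stand-in for the audio input side: push fragments as they 'arrive'."""
    for fragment in fragments:
        await asyncio.sleep(0)  # yield control so the responder can run
        await inbox.put(fragment)
    await inbox.put(None)  # sentinel: caller has hung up


async def respond(inbox, log):
    """Stand-in for the output side: react to each fragment without waiting for silence."""
    while True:
        fragment = await inbox.get()
        if fragment is None:
            break
        log.append((fragment, classify_interjection(fragment)))


async def run_call(fragments):
    inbox = asyncio.Queue()
    log = []
    # Both sides run concurrently, which is the point of the sketch.
    await asyncio.gather(listen(fragments, inbox), respond(inbox, log))
    return log


log = asyncio.run(run_call(["um", "actually, make it Tuesday", "does that include tax?"]))
for fragment, kind in log:
    print(f"{kind}: {fragment}")
```

In a turn-based design, by contrast, the responder would not start until the listener had finished.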

Reasoning runs in the background with no added latency, according to xAI’s announcement. The model can also invoke external APIs during a call without introducing pauses. xAI demonstrated this with an edge case: when asked which months of the year contain the letter X, Grok Voice Think Fast 1.0 correctly answered “none,” while competing models confidently said “February.”
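The month question is easy to verify by hand or in a few lines of Python. The sketch below hardcodes the English month names and uses a hypothetical helper, `months_containing`, to list the months that include a given letter.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def months_containing(letter: str) -> list[str]:
    """Return the English month names that contain the given letter (case-insensitive)."""
    return [month for month in MONTHS if letter.lower() in month.lower()]


print(months_containing("x"))
```

No English month name contains an X, so the list comes back empty, matching the answer attributed to Grok Voice Think Fast 1.0.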

Starlink Deployment

The model is already handling live phone operations at Starlink, according to MarkTechPost. If that holds, it is one of the few voice agent models with a reported production deployment at enterprise scale, not just benchmark results.

Grok Voice Think Fast 1.0 is available via the xAI API for customer support, sales, and enterprise agent workflows, as reported by CastleCrypto.

The Voice Agent Market Shift

Voice agents have been the weakest link in the agentic AI stack. Text-based agents can plan, reason, and use tools. Voice agents historically struggled with basic call handling. The τ-voice Bench was created specifically to test production readiness under messy real-world conditions, and until now, no model had cracked 50%.

xAI’s 67.3% score combined with a live Starlink deployment suggests voice agents may be crossing the threshold from demo to production. For enterprises running contact centers, the question shifts from “can voice agents work?” to “which vendor’s voice agent works best for our vertical?”
