Parasail, a San Francisco-based AI infrastructure startup founded by former Groq executive Mike Henry, has closed a $32 million Series A co-led by Touring Capital and Kindred Ventures, bringing total funding to $42 million. Samsung NEXT and Flume also participated in the round, according to TechCrunch and SiliconANGLE.
Parasail operates what it calls an “AI Supercloud,” a pay-per-token inference platform that lets developers deploy AI workloads with as few as five lines of code and no long-term GPU contracts. The company rents processing time across 40 data centers in 15 countries, orchestrating workload allocation behind the scenes to drive down inference costs. The platform currently processes 500 billion tokens per day.
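The allocation layer described above can be sketched as a cost-based router: given a fleet of providers with pay-per-token prices, pick the cheapest one with capacity. This is a minimal illustration, not Parasail's actual scheduler; the `Provider` type, provider names, and prices are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    """Hypothetical GPU provider with pay-per-token pricing."""
    name: str
    price_per_million_tokens: float  # USD
    available: bool = True

def route_request(providers: list[Provider], tokens: int) -> tuple[str, float]:
    """Pick the cheapest available provider; return (name, estimated cost in USD)."""
    candidates = [p for p in providers if p.available]
    if not candidates:
        raise RuntimeError("no capacity available")
    best = min(candidates, key=lambda p: p.price_per_million_tokens)
    cost = tokens / 1_000_000 * best.price_per_million_tokens
    return best.name, cost

# Example: three hypothetical data centers with different spot prices.
fleet = [
    Provider("us-west-h200", 0.90),
    Provider("eu-central-h200", 0.75),
    Provider("ap-south-h200", 0.80, available=False),
]
name, cost = route_request(fleet, tokens=2_000_000)
print(name, cost)  # eu-central-h200 1.5
```

A production router would also weigh latency, queue depth, and model availability per region, but the core value proposition (arbitraging price differences across many data centers) reduces to this kind of selection.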
The platform is inference-only; training workloads are not supported. Henry, who built Groq’s cloud offering before founding Parasail, told TechCrunch that this focus, along with a willingness to serve startup customers without long-term commitments, differentiates Parasail from hyperscalers and even better-funded inference competitors like Fireworks AI and Baseten.
The Agent-Driven Demand Thesis
The investment thesis is straightforward: as AI agents proliferate and split tasks across multiple model calls, inference volume grows faster than model capability. Samir Kumar, a partner at Touring Capital, told TechCrunch he expects inference to account for at least 20% of the cost of building software in the future.
Andreas Stuhlmüller, CEO of research assistant startup Elicit, described the pattern driving demand. His pharmaceutical customers use open models for initial screening of tens of thousands of scientific papers, with a more capable frontier model providing the final answer. “We’ve moved more towards open models because it’s pretty rough sending 100,000s of requests to an API endpoint,” Stuhlmüller told TechCrunch.
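The pattern Stuhlmüller describes is a two-stage cascade: a cheap open model scores a large corpus, and an expensive frontier model answers only over the survivors. A minimal sketch follows; both model calls are stubbed out (the scoring and answer functions here are placeholders, not Elicit's or any provider's API).

```python
def open_model_score(paper: str, query: str) -> float:
    """Stub for a cheap open-model relevance score (keyword overlap here)."""
    terms = set(query.lower().split())
    words = set(paper.lower().split())
    return len(terms & words) / max(len(terms), 1)

def frontier_model_answer(papers: list[str], query: str) -> str:
    """Stub for the single expensive frontier-model call over the shortlist."""
    return f"Answer to {query!r} grounded in {len(papers)} papers"

def screen_then_answer(corpus: list[str], query: str, keep: int = 2) -> str:
    # Stage 1: score every paper with the cheap model, keep the top `keep`.
    ranked = sorted(corpus, key=lambda p: open_model_score(p, query), reverse=True)
    shortlist = ranked[:keep]
    # Stage 2: one frontier-model call over the shortlist only.
    return frontier_model_answer(shortlist, query)

corpus = [
    "kinase inhibitor binding affinity in oncology trials",
    "weather patterns over the north atlantic",
    "selective kinase inhibitor toxicity screening results",
]
print(screen_then_answer(corpus, "kinase inhibitor screening"))
```

The economics explain the inference-demand thesis: stage 1 issues one cheap call per document (tens of thousands of calls), so token volume scales with corpus size rather than with the capability of the final model.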
Steve Jang, a partner at Kindred Ventures, was blunt about the market sizing: “Everyone thought there was an AI bubble. There’s no AI bubble. Inference demand is far outstripping supply,” he told TechCrunch.
Product Details
Parasail offers three tiers of service, according to SiliconANGLE: serverless hosting, available in two options that automate GPU cluster management; dedicated endpoints with configurable autoscaling and quantization support for performance-sensitive workloads; and a batch processing service for cost-optimized bulk inference. The most advanced GPU currently available on the platform is the Nvidia H200.
The company plans to use the funding to enhance inference workload optimization, strengthen its partner ecosystem, and invest in go-to-market initiatives, per the PRNewswire announcement.
The Compute Brokerage Layer
Parasail occupies a specific layer of the emerging agent infrastructure stack: the inference routing and pricing layer between developers and GPU providers. It abstracts multi-provider compute into a single developer-facing platform where AI agent capacity is provisioned in minutes, not weeks. For teams building agent products that need to scale inference across providers without managing GPU procurement directly, Parasail is now a funded option with $42 million in total capital behind it.