Apple used WWDC26 to ship the infrastructure that makes every iPhone app a potential AI agent host. The Foundation Models framework, announced June 9 and now available in the iOS 27 developer beta, is a native Swift API that gives developers direct access to Apple’s on-device models, third-party providers like Anthropic’s Claude and Google’s Gemini, and any other model that conforms to Apple’s new Language Model protocol. The framework ships alongside a ground-up Siri rebuild and a set of developer tools explicitly designed for what Apple calls “agentic app experiences.”
This is Apple’s entry into the agent infrastructure race, and its weapon is distribution: over 2 billion active devices that already run the apps these agents would live inside.
The Foundation Models Framework
The Foundation Models framework is the centerpiece. According to Apple’s WWDC26 developer documentation, the framework provides a unified Swift API that works with Apple Foundation Models running on-device and on Private Cloud Compute, as well as “any model provider with a Swift package conforming to the Language Model protocol.”
That protocol is the key architectural decision. Apple explicitly named Claude and Gemini as compatible providers in its WWDC26 Apple Intelligence guide, signaling that the framework is designed for multi-vendor model access from day one. WWDC26 Session 339, titled “Bring an LLM provider to the Foundation Models framework,” walks third-party model vendors through the integration process.
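To make the multi-vendor idea concrete, here is a minimal sketch of protocol-based provider conformance. Apple's actual protocol requirements are not described here, so `LanguageModelProvider`, `respond(to:)`, and both stub types are hypothetical stand-ins invented for illustration, not Apple's API.

```swift
// Hypothetical stand-in for Apple's Language Model protocol.
// The real protocol's requirements are not described in this article.
protocol LanguageModelProvider {
    var name: String { get }
    func respond(to prompt: String) -> String
}

// Toy provider; a real one would run a local model.
struct OnDeviceStub: LanguageModelProvider {
    let name = "on-device"
    func respond(to prompt: String) -> String {
        "[\(name)] \(prompt.count) characters received"
    }
}

// Toy provider; a real one would call a vendor SDK or network API.
struct CloudStub: LanguageModelProvider {
    let name = "cloud"
    func respond(to prompt: String) -> String {
        "[\(name)] echo: \(prompt)"
    }
}

// App code depends only on the protocol, so providers are interchangeable.
func ask(_ provider: any LanguageModelProvider, _ prompt: String) -> String {
    provider.respond(to: prompt)
}
```

The point of the design is that `ask` never names a vendor. Swapping Claude for Gemini, in this picture, would mean passing a different conforming value.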
Three capabilities stand out for agent builders:
Dynamic Profiles allow developers to swap models, tools, and instructions within a continuous session. An app can start a task with Apple’s on-device model for speed, escalate to Claude for complex reasoning, and fall back to a smaller model for cost efficiency, all without breaking the conversation context. This is runtime model orchestration built into the OS.
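A rough sketch of what swapping profiles inside one session could look like follows. It shows only the shape of the idea: the active profile changes while the transcript persists. `ModelProfile`, `Turn`, `AgentSession`, and the model names are all hypothetical and do not come from Apple's SDK.

```swift
// Hypothetical types; none of these names come from Apple's SDK.
struct ModelProfile {
    let modelName: String
    let instructions: String
    let toolNames: [String]
}

struct Turn {
    let modelName: String
    let prompt: String
}

final class AgentSession {
    private(set) var profile: ModelProfile
    private(set) var transcript: [Turn] = []

    init(profile: ModelProfile) {
        self.profile = profile
    }

    // Swap model, tools, and instructions; the transcript is left untouched.
    func switchProfile(to newProfile: ModelProfile) {
        profile = newProfile
    }

    // Record the turn against whichever profile is active right now.
    func send(_ prompt: String) {
        transcript.append(Turn(modelName: profile.modelName, prompt: prompt))
    }
}

let fast = ModelProfile(modelName: "on-device", instructions: "Be brief.", toolNames: [])
let deep = ModelProfile(modelName: "cloud-reasoner",
                        instructions: "Think step by step.",
                        toolNames: ["calendar"])

let session = AgentSession(profile: fast)
session.send("Summarize my week")
session.switchProfile(to: deep)
session.send("Now plan next week around it")
```

After the swap, both turns sit in the same transcript, each tagged with the model that handled it. That is the "continuous session" property in miniature.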
Multimodal prompts let developers pass images alongside text so apps can reason about visual content. On-device Vision framework tools, including OCR and barcode readers, are available for models to call directly. The model can read a receipt, extract line items, and act on them without leaving the device.
The Evaluations framework provides testing infrastructure for AI features that goes “beyond what unit tests alone can catch,” according to Apple’s documentation. WWDC26 dedicated three sessions to evaluation: “Meet the Evaluations framework” (Session 298), “Create robust evaluations for agentic apps” (Session 299), and “Improve your prompts by hill-climbing with Evaluations” (Session 335). Building evaluation tooling into the platform SDK, rather than leaving it to third-party libraries, signals that Apple expects developers to build complex, multi-step agent workflows that need production-grade testing.
The Siri AI Rebuild
The framework powers a completely rebuilt Siri. Apple’s senior director of watchOS software engineering David Clark told TechRadar that the team “really wanted to make sure the Siri experience is a singular and consistent experience, whether I decide to ask Siri on my wrist a question, or whether I have my phone in my hand.” The old Siri had different capabilities on different devices. The new one runs the same model stack everywhere.
Joanna Stern tested the beta for a week and called it “good. Like good-good.” Her tests showed Siri pulling data from Messages, Calendar, and voicemail to generate personalized recommendations. When she asked for souvenir suggestions for her kids at the beach, Siri produced accurate, specific answers drawn from her message history. “AI is only as good as the data it has. And oh boy, does Apple have a lot of mine,” Stern wrote on X.
The personal context engine is where Apple’s agent play diverges from every other player. OpenClaw agents access tools and APIs. Enterprise platforms like Microsoft’s Scout connect to corporate data. Apple’s agents sit on top of a decade of personal data already on the device: messages, photos, location history, health data, purchase records. No other agent platform has that depth of context on consumer users.
App Intents: The Middleware Layer
The App Intents framework connects third-party apps to Siri AI through what Apple calls “schemas,” structures that let Siri understand app content without developers defining specific trigger phrases.
Entity schemas contribute app content to the Spotlight semantic index, which feeds Siri’s personal context understanding. Intent schemas let users take action on that content through natural language. The framework covers common app categories: task management, photo editing, communication.
The new View Annotations API adds on-screen awareness. Developers can map their UI views to entities, letting users reference and act on what’s visible on screen through conversation. “Show me more like this photo” or “add this item to my list” become possible without the app implementing bespoke voice commands.
Because schemas are system-defined, Apple says apps “benefit automatically from future improvements” to Siri’s language understanding, including expansion to new languages. This is a different model from API-first agent platforms, where developers write explicit tool definitions. Apple is asking developers to describe what their app does, then letting the system figure out when and how to invoke it.
The Small Business Subsidy
Apple added an economic incentive for smaller developers. If an app’s publisher is enrolled in the App Store Small Business Program and the app has fewer than 2 million total first-time downloads, the developer can access next-generation Apple Foundation Models running on Private Cloud Compute at no cloud API cost, according to Apple’s WWDC26 documentation.
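The eligibility rule as stated reduces to two conditions. The sketch below encodes them directly; the type and function names are hypothetical, and Apple's actual enforcement mechanism is not described in the source.

```swift
// Hypothetical model of the eligibility rule as stated in the article.
struct AppListing {
    let enrolledInSmallBusinessProgram: Bool
    let totalFirstTimeDownloads: Int
}

// "Fewer than 2 million" is a strict bound: exactly 2,000,000 does not qualify.
let downloadThreshold = 2_000_000

func qualifiesForFreeCloudInference(_ app: AppListing) -> Bool {
    app.enrolledInSmallBusinessProgram
        && app.totalFirstTimeDownloads < downloadThreshold
}
```

Both conditions must hold, so an app with few downloads whose publisher is not enrolled in the program would not qualify under this reading.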
This is effectively a free inference subsidy. For a startup building an AI-powered iOS app, model inference, often the most expensive line item, drops to zero as long as the startup stays within Apple’s ecosystem and uses Apple’s models. The 2 million download threshold is generous: it likely covers the vast majority of apps on the App Store.
The subsidy creates a strategic funnel. Small developers build on Apple’s models because inference is free. As their apps grow past the threshold, they’re already integrated with the Foundation Models framework. Switching to a third-party provider means paying for inference. Staying with Apple means negotiating enterprise terms. Either way, Apple’s framework is the runtime.
Agentic Architecture at the OS Level
WWDC26 dedicated multiple sessions to agentic development patterns. “Build agentic app experiences with the Foundation Models framework” (Session 242) covers multi-step agent workflows. “Run local agentic AI on the Mac using MLX” (Session 232) extends the pattern to Mac. “Xcode, agents, and you” (Session 259) introduces agentic coding within Apple’s IDE itself.
For the first time, the word “agentic” appears repeatedly in official Apple developer materials. “Meet Core AI” (Session 324) introduces a new framework layer, and “Debug and profile agentic app experiences with Instruments” (Session 243) covers performance profiling specifically for agent workloads.
Apple is shipping a full agent development stack: model access, tool calling, evaluation, profiling, and debugging. The stack is integrated into Xcode, runs on-device by default, and can escalate to Private Cloud Compute or third-party cloud models when local hardware is insufficient.
The Platform Play
FourWeekMBA’s analysis framed iOS 27 as hiding “a model marketplace in the beta.” That framing captures the competitive positioning: Apple is building the venue where model providers compete for access to Apple’s install base.
The pattern mirrors what Nvidia is doing with NemoClaw agent blueprints and what Microsoft built with Scout, both on top of the OpenClaw framework. In each case, the bet is that the company controlling the runtime, orchestration, and distribution layer captures more value than the company that built the model. The Foundation Models framework is Apple’s version of that bet, applied to consumer devices rather than enterprise infrastructure.
The difference is scale. Nvidia sells to enterprises. Microsoft sells to organizations. Apple sells to individuals, who already carry the hardware across more than 2 billion active devices. If the Foundation Models framework gains developer adoption at even a fraction of the rate that other Apple frameworks have historically achieved, every major LLM provider will need to ship a conforming Swift package or lose access to the largest consumer device ecosystem in the world.
What the Evaluation Stack Reveals
The three WWDC sessions dedicated to evaluation tooling reveal more about Apple’s ambitions than the model access APIs do. Companies don’t build evaluation frameworks for simple chatbot interactions. They build them when they expect developers to ship complex, multi-step, context-dependent agent behaviors that need to work reliably in production.
“Create robust evaluations for agentic apps” (Session 299) specifically targets agent workflows. The Evaluations framework tests behavior “across dynamic conditions” rather than static inputs, which suggests Apple expects developers to build apps where the model’s behavior changes based on user context, available tools, and runtime model selection through Dynamic Profiles.
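One way to picture testing “across dynamic conditions” is to run the same request over a grid of runtime conditions and check an invariant on every cell. The sketch below does that in plain Swift. It does not use the Evaluations framework, whose API is not described here, and `Condition`, `chooseTool`, and the invariant are hypothetical illustrations.

```swift
import Foundation

// Hypothetical runtime condition: which model, which tools, what the user allows.
struct Condition {
    let modelName: String
    let availableTools: Set<String>
    let userHasCalendarAccess: Bool
}

// Toy agent step under test: returns a tool name, or nil to answer directly.
func chooseTool(for request: String, under c: Condition) -> String? {
    if request.contains("meeting"),
       c.userHasCalendarAccess,
       c.availableTools.contains("calendar") {
        return "calendar"
    }
    return nil
}

// Build the full grid: 2 models x 2 tool sets x 2 permission states.
func allConditions() -> [Condition] {
    let toolSets: [Set<String>] = [[], ["calendar"]]
    var grid: [Condition] = []
    for model in ["on-device", "cloud"] {
        for tools in toolSets {
            for access in [false, true] {
                grid.append(Condition(modelName: model,
                                      availableTools: tools,
                                      userHasCalendarAccess: access))
            }
        }
    }
    return grid
}

// Invariant: the agent never picks a tool that is unavailable or unpermitted.
// Returns every condition under which the invariant is violated.
func failures(for request: String) -> [Condition] {
    allConditions().filter { c in
        guard let tool = chooseTool(for: request, under: c) else { return false }
        return !c.availableTools.contains(tool) || !c.userHasCalendarAccess
    }
}
```

A static unit test would check one input against one output. Here the assertion is a property that must hold across every combination of model, tool set, and permission state.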
This is infrastructure for a world where every app on your phone runs some form of agent: a photo app that automatically organizes photos based on personal context, a calendar app that negotiates meeting times across participants, a health app that correlates sensor data with behavior patterns. Apple is building the platform that makes it cheap, fast, and testable for 30 million registered Apple developers to build those agents.
The agent infrastructure race has a new entrant, and this one ships pre-installed on a billion phones.