At Cloud Next 2026 on April 23, Google Cloud unveiled the Agentic Data Cloud, a three-pillar architecture that replaces the traditional enterprise data stack, built for humans running scheduled queries, with infrastructure purpose-built for autonomous AI agents operating around the clock. The announcement represents the most significant architectural repositioning of Google’s data platform since the launch of BigLake in 2022.
“The data architecture has to change now,” Andi Gutmans, VP and GM of Data Cloud at Google Cloud, told VentureBeat. “We’re moving from human scale to agent scale.”
The Three Pillars
The Agentic Data Cloud rests on three components: a Knowledge Catalog for semantic context, a cross-cloud lakehouse for data access, and a Data Agent Kit for developer tooling.
Knowledge Catalog. This is an evolution of Google’s existing Dataplex governance product, rebuilt with a materially different architecture. Where traditional data catalogs required data stewards to manually label tables, define business terms, and build glossaries, the Knowledge Catalog automates that process using AI agents. According to Google’s announcement, the catalog covers BigQuery, Spanner, AlloyDB, and Cloud SQL natively, and federates with third-party catalogs including Collibra, Atlan, and DataHub. Zero-copy federation pulls semantic context from SaaS applications like SAP, Salesforce Data360, ServiceNow, and Workday without requiring data movement.
The practical implication: agents can query enterprise data using shared definitions of business terms, not just raw table schemas. Gutmans told VentureBeat that manually curated catalogs cannot scale to agent query volumes. “We need to make sure that all of enterprise data can be activated with AI, that includes both structured and unstructured data,” he said.
Cross-cloud lakehouse. BigQuery can now query Apache Iceberg tables sitting on Amazon S3 via Google’s Cross-Cloud Interconnect, a dedicated private networking layer, with no egress fees. Google claims price-performance comparable to native AWS warehouses. Bidirectional federation in preview extends to Databricks Unity Catalog on S3, Snowflake Polaris, and the AWS Glue Data Catalog using the open Iceberg REST Catalog standard.
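To make the cross-cloud claim concrete, here is a minimal sketch of what such a query could look like from the BigQuery Python client, assuming an Iceberg table on S3 has already been registered as an external table. The project, dataset, and table names are hypothetical.

```python
# Sketch: joining an Iceberg table on Amazon S3 with a native BigQuery
# table in one statement. All identifiers below are illustrative; the
# Cross-Cloud Interconnect and external-table setup happen separately.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # hypothetical project

sql = """
SELECT o.order_id, o.total, c.segment
FROM `my-analytics-project.s3_lakehouse.orders` AS o      -- Iceberg on S3
JOIN `my-analytics-project.warehouse.customers` AS c      -- native BigQuery
  ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'
"""

for row in client.query(sql).result():
    print(row.order_id, row.total, row.segment)
```

The notable part is not the syntax but the economics: per Google's announcement, the S3 side of that join incurs no egress fees, which is what makes running it repeatedly at agent scale plausible.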
Data Agent Kit. A portable suite of Model Context Protocol (MCP) tools and IDE extensions that drops into VS Code, Claude Code, Gemini CLI, and Codex. Rather than writing Spark pipelines to move data between sources, data engineers describe outcomes, and the agent selects the appropriate framework (dbt, Apache Spark, Apache Airflow) and generates production-ready code. Three specialized agents ship at GA: a Data Engineering Agent for pipeline transformations, a Data Science Agent for model lifecycle automation, and a Database Observability Agent for infrastructure diagnostics.
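Google has not published examples of the code these agents emit, but a rough sketch of the outcome-based pattern might look like the following: asked to "refresh the revenue model daily," an agent picks Airflow and dbt and generates a pipeline along these lines. The DAG ID, file paths, and dbt selector are illustrative assumptions, not Google's documented output.

```python
# Hypothetical agent-generated pipeline: stage new order files, then run
# the dbt model that rebuilds the revenue mart. All names are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="refresh_revenue_model",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",  # Airflow 2.4+ scheduling argument
    catchup=False,
) as dag:
    stage_orders = BashOperator(
        task_id="stage_orders",
        bash_command="python /pipelines/stage_orders.py --date {{ ds }}",
    )
    rebuild_revenue = BashOperator(
        task_id="rebuild_revenue_mart",
        bash_command="dbt run --select revenue_mart",
    )
    stage_orders >> rebuild_revenue  # staging must finish before the dbt rebuild
```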
Why the Data Layer Became the Bottleneck
For 50 years, enterprise databases had one job: store data and return exact results on demand. AI agents break that contract. According to Sailesh Krishnamurthy, VP of Engineering for Databases at Google Cloud, today’s agent applications need “the best results, not just exact ones,” requiring graph traversal, vector embeddings, full-text search, and relational operations to coexist in a single system rather than forcing costly data movement.
“When you have this opportunity to look at data as a graph, look at data with vector embeddings, do semantic search or full-text search, all of a sudden, it’s not about getting the exact results, but getting the best results and the best quality,” Krishnamurthy told SiliconANGLE.
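A minimal sketch makes the point concrete: in PostgreSQL-compatible systems with the pgvector extension (which AlloyDB supports), a single statement can combine a relational filter, a full-text match, and vector-similarity ranking. Table and column names below are hypothetical.

```python
# Sketch of the "best results" pattern: one query mixing relational,
# full-text, and vector operations instead of moving data between systems.
import psycopg

QUERY = """
SELECT product_id, name
FROM products
WHERE category = %(category)s                        -- relational filter
  AND to_tsvector('english', description)
      @@ plainto_tsquery('english', %(keywords)s)    -- full-text match
ORDER BY embedding <=> %(query_embedding)s::vector   -- vector similarity ranking
LIMIT 10
"""

with psycopg.connect("dbname=catalog") as conn:  # connection string is illustrative
    rows = conn.execute(
        QUERY,
        {
            "category": "outdoor",
            "keywords": "lightweight waterproof jacket",
            "query_embedding": "[0.12, -0.03, 0.88]",  # stand-in for a real embedding
        },
    ).fetchall()
```

The ORDER BY-plus-LIMIT pattern is what separates "best results" retrieval from exact-match SQL: the query ranks candidates by semantic distance rather than filtering down to a deterministic set.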
The core problem Google is addressing: agents operating 24/7 across an enterprise generate query volumes that overwhelm manually curated metadata systems. Without semantic context, agents hallucinate or make expensive errors because they lack understanding of what business terms actually mean within a specific organization. A data catalog that covers only the curated subset a small team can maintain by hand becomes a liability at agent scale.
Google also announced Spanner Omni alongside the Agentic Data Cloud, a downloadable edition of its globally distributed database that runs on-premises or across rival clouds. Agentic migration tooling powered by Gemini compresses database migration timelines from months to weeks by handling not just schema and data but the application layer, including embedded SQL queries. “Today, with the power of Gemini, we are excited that people are able to migrate their whole application stack so much faster,” Krishnamurthy said.
Production Deployments Already Running
Three enterprise deployments illustrate the scale Google is targeting. According to the Google Cloud blog:
- Vodafone has launched hundreds of agents on the platform to deliver uninterrupted customer service, with projected savings of millions of euros annually.
- American Express is migrating a core on-premises data warehouse and hundreds of production applications to BigQuery to power what Google calls “trusted agentic commerce at scale.”
- Virgin Voyages runs over 1,000 specialized AI agents, including one that reduced mass itinerary rebooking time from six hours to 11 minutes.
These are not pilot programs. Virgin Voyages’ 1,000-agent deployment and American Express’s warehouse migration represent production workloads where the data infrastructure layer directly determines whether agents can operate reliably.
How Competitors Are Positioning
The premise that agents require semantic context, not just data access, is shared across the market. Databricks has Unity Catalog for governance and semantic layering across its lakehouse. Snowflake has Cortex for AI and semantic context. Microsoft Fabric includes a semantic model layer built for business intelligence and, increasingly, agent grounding.
The disagreement is over who builds and maintains those semantics. Google’s bet is on automated curation. “Our goal is just to get all the semantics you can get,” Gutmans told VentureBeat, noting that Google will federate with third-party semantic models rather than require customers to start over.
Multicloud interoperability is the second competitive axis. With 84% of cloud leaders intentionally selecting multiple clouds according to Kyndryl’s 2025 Cloud Readiness Report, the Agentic Data Cloud’s ability to query across AWS and (eventually) Azure without egress fees targets a structural pain point. Google and AWS launched a collaboration in December 2025 to simplify multicloud connections, with Microsoft Azure planning to join later in 2026.
Cross-platform integrations are accelerating across the industry. Databricks expanded cloud partnerships with Microsoft and Google Cloud last year. Snowflake restructured its data cloud to support enterprise AI agent adoption, connecting to Google Drive and Salesforce Data Cloud. CoreWeave recently collaborated with Google Cloud to enable AI training and inference across their platforms, as CIO Dive reported.
The MCP Bet
A technical detail that carries strategic weight: Google has fully embraced the Model Context Protocol as the interface layer between agents and data assets. MCP servers are now available for BigQuery, Spanner, AlloyDB, Cloud SQL, and Looker, with agent interactions governed by existing IAM policies, VPC Service Controls, and data residency requirements.
This means any MCP-compatible agent, regardless of which model provider built it, can discover and use data assets in Google Cloud. The Data Agent Kit ships with the same skills and tools that power Google’s own first-party agents, including the Deep Research Agent that performs multi-step reasoning across BigQuery, internal documents, and web assets.
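As a rough illustration of that discovery step, the sketch below uses the reference MCP Python SDK to connect to a database MCP server over stdio and list its tools. The server binary name is a placeholder; Google's actual server commands and tool names are not specified in the announcement.

```python
# Sketch: an MCP client discovering the tools a database MCP server exposes.
# IAM, VPC Service Controls, and residency enforcement happen server-side,
# per Google's announcement. The server command below is hypothetical.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    server = StdioServerParameters(command="bigquery-mcp-server")  # placeholder binary

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(main())
```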
The decision to use an open protocol rather than a proprietary interface is notable. It positions Google’s data layer as a neutral substrate for agents built by Anthropic, OpenAI, or open-source communities, not just Gemini-powered ones. Whether that openness survives competitive pressure remains to be seen.
The Strategic Calculation
Dion Hinchcliffe, VP and practice lead at The Futurum Group, framed the strategic logic succinctly: “Google is betting that whoever owns the data context layer for agents will control enterprise automation outcomes. If Google succeeds, it doesn’t need to win the entire cloud stack. It just needs to become the system where enterprise data is understood well enough for their agents to act better than other agents, which is in fact the highest-value layer in an agentic enterprise.”
That framing clarifies why this announcement matters more than a typical platform update. Google is conceding that it may not win the compute layer (AWS dominates infrastructure) or the application layer (Microsoft owns the enterprise workflow surface area). The Agentic Data Cloud is a bet on a third path: if your data context layer becomes the substrate agents depend on for accuracy, you capture the control point regardless of where the models or applications live.
The risk is execution. Automated semantic curation sounds elegant in a keynote, but enterprise data estates are messy. Whether the Knowledge Catalog can actually infer business logic from query logs across heterogeneous systems, without producing errors that propagate through agent decisions at scale, is an open question. The cross-cloud lakehouse’s zero-egress-fee promise also depends on sustained partnership with AWS, which has its own reasons to keep data gravity within its ecosystem.
For infrastructure teams evaluating their data stack for agent workloads, Google’s announcement surfaces three concrete questions. First, can your current data catalog scale to agent-level query volumes without manual stewardship? Second, are cross-cloud egress fees silently taxing your agent operations? Third, are your data engineers still writing pipelines, or have they shifted to outcome-based orchestration? The answers will determine whether the transition from human-scale to agent-scale data infrastructure happens deliberately or by crisis.