Salesforce has published a 12-rule framework for enterprise agentic AI deployments, drawing on observations from more than 20,000 production agent deployments. The framework was developed by John Taschek, Salesforce’s executive vice president and chief market strategy officer, and reported by ZDNET.
The central thesis is blunt: most agentic AI pilots that fail are not failing because of the AI. They are failing because of architecture.
The Core Finding
Salesforce identified three recurring mistakes across its deployment base. First, teams over-rely on language models to handle tasks that should be governed by deterministic rules. Legal compliance, financial guardrails, and safety constraints should be hard-coded, not left to probabilistic model outputs. Second, enterprises consistently underinvest in context engineering, feeding agents messy, siloed, or stale data and expecting good results. Third, adversarial testing is treated as a launch-day checkbox rather than an ongoing discipline.
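The first mistake is easiest to see in code. What follows is a minimal sketch, not Salesforce's implementation: it assumes an agent that proposes actions as structured data, with a plain deterministic check that runs before anything is executed. The names `ProposedAction`, `enforce_guardrails`, `REFUND_LIMIT_USD`, and `BLOCKED_REGIONS` are hypothetical, as are the limit and region values.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical policy values for illustration; not taken from the framework.
REFUND_LIMIT_USD = 500.00
BLOCKED_REGIONS = frozenset({"XX"})


@dataclass(frozen=True)
class ProposedAction:
    """An action the model wants to take, before any rule has approved it."""
    kind: str             # e.g. "refund"
    amount_usd: float
    customer_region: str


def enforce_guardrails(action: ProposedAction) -> tuple[bool, str]:
    """Apply deterministic rules to a model-proposed action.

    Returns (allowed, reason). The check is ordinary code, so the outcome
    does not depend on how the prompt was worded.
    """
    if action.kind == "refund" and action.amount_usd > REFUND_LIMIT_USD:
        return False, "refund exceeds hard limit; escalate to a human"
    if action.customer_region in BLOCKED_REGIONS:
        return False, "region is blocked by compliance policy"
    return True, "ok"
```

In this sketch the model can propose a 750-dollar refund, but `enforce_guardrails` rejects it every time, which is the property a prompt instruction alone cannot guarantee.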
The most counterintuitive insight, according to ZDNET’s reporting, is the difference in work distribution between traditional software and agents. With traditional software, roughly 90% of the engineering work is complete before launch. With AI agents, 90% of the work comes after deployment, in managing, monitoring, and improving agent behavior in production.
The 12 Rules
Taschek modeled the framework on computer scientist Edgar F. Codd’s 12 Rules for relational database management systems, published in 1985. The agentic AI rules are organized into four layers:
Foundation (data and context): Every piece of data feeding an agent must have traceable lineage. Agents must operate on live data, not stale snapshots. Semantic metadata, where terms like “at-risk customer” are formally defined rather than guessed by the model, is required.
Core (agency): Every agent decision must be logged and explainable. Continuous adversarial testing is mandatory, not optional. Agents must decompose complex goals into steps and adapt when conditions change. Legal, financial, and safety rules must be architecturally enforced, not prompt-engineered.
Operations (work): Agents from different vendors must coordinate without custom integration for every pairing. Human-agent handoffs must include full context. The enterprise retains control over data residency, model selection, and access policies. Agent performance is measured by business outcomes, not task completion counts.
Apex (trust): This is the highest-weighted rule. Agents earn the right to act through fairness testing across protected groups, toxicity screening, consent enforcement, hallucination prevention, and explainability. Vendor accountability for agent failures must be pre-assigned, not negotiated after incidents.
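The Foundation-layer requirements lend themselves to a short illustration. The sketch below assumes each piece of context reaches the agent wrapped in a record that carries its source and fetch time. The names `ContextRecord`, `is_usable`, `is_at_risk_customer`, and `MAX_AGE`, along with the 15-minute threshold and the fields in the "at-risk customer" definition, are all hypothetical choices made for illustration and do not come from the framework.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical freshness threshold; a real value would be set per data source.
MAX_AGE = timedelta(minutes=15)


@dataclass(frozen=True)
class ContextRecord:
    """A piece of data offered to an agent, with its lineage attached."""
    value: dict
    source_system: str     # lineage: which system produced the data
    fetched_at: datetime   # timezone-aware timestamp of retrieval


def is_usable(record: ContextRecord, now: datetime) -> bool:
    """Reject records with no traceable source or that are older than MAX_AGE."""
    has_lineage = bool(record.source_system.strip())
    is_fresh = now - record.fetched_at <= MAX_AGE
    return has_lineage and is_fresh


def is_at_risk_customer(value: dict) -> bool:
    """A formal, hypothetical definition of 'at-risk customer'.

    The point is that the term is defined once in code or metadata,
    not inferred by the model from the phrase itself.
    """
    return value["days_since_last_login"] > 30 and value["open_tickets"] >= 2
```

An orchestration layer could call `is_usable` on every record before it enters the agent's context and pass the result of `is_at_risk_customer` as a labeled fact, so the model never has to guess what the term means.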
Why It Matters for Builders
The framework arrives as enterprise agentic AI spending faces its first serious scrutiny. Earlier this week, NCT reported on a broader enterprise token spending backlash, with corporations implementing token budgets and agent loop oversight for the first time.
Salesforce’s data adds specificity to that trend. According to the company’s own workforce survey, more than half of US desk workers consider themselves AI skeptics. The top three reasons workers cite for unsuccessful AI pilots are generic outputs, insufficient training, and low trust in outputs. Separate research from Informatica found that more than half of agentic AI adopters cite data quality and retrieval issues as deployment barriers.
For engineering teams evaluating agentic AI architecture, the practical takeaway from Salesforce’s 20,000-deployment dataset is that model selection is a secondary concern. Data lineage, governance automation, and continuous adversarial testing are the variables that determine whether a pilot reaches production or stalls.