Researchers from Google DeepMind, Microsoft Research, Columbia University, t54 Labs, and Virtuals Protocol published an open-source framework on April 13 that applies financial risk management principles to AI agent transactions. The Agentic Risk Standard (ARS) uses escrow, underwriting, and collateralization to bound user losses when autonomous agents handle payments and assets.

The Guarantee Gap

The paper identifies a structural disconnect between AI safety research and production deployment. Safety techniques improve model behavior probabilistically, reducing failure rates through better training, alignment, and guardrails. But large language models are inherently stochastic. No training regime eliminates failure entirely.

For low-stakes tasks, probabilistic reliability is sufficient. For agents executing financial transactions, trading assets, or moving funds, users need enforceable guarantees with bounded downside. The researchers call this disconnect the “guarantee gap.”

The gap is not theoretical. According to the paper, in a 2025 autonomous crypto trading competition, most AI agents lost money. One model lost 63% of its capital; others dropped between 30% and 56%. When agents misexecute with real money, the user absorbs the loss.

Two Settlement Modes

ARS formalizes the transaction lifecycle as a deterministic state machine with explicit fund-control rules. Regardless of how an agent behaves internally, financial outcomes are governed by auditable settlement logic.
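A deterministic lifecycle of this kind can be sketched as a small transition table. The state names and transitions below are illustrative assumptions, not the published ARS specification; the point is that fund movements are gated by auditable settlement logic rather than by the agent's behavior.

```python
from enum import Enum, auto

class TxState(Enum):
    # Illustrative states; the actual ARS state machine may differ.
    INITIATED = auto()
    FUNDS_LOCKED = auto()   # payment held in escrow / collateral posted
    DELIVERED = auto()
    VERIFIED = auto()       # a trusted party has checked the work
    SETTLED = auto()        # payment released to the agent provider
    REFUNDED = auto()       # failed delivery: funds returned to the user

# Allowed transitions: outcomes are governed by this table, not by the agent.
TRANSITIONS = {
    TxState.INITIATED:    {TxState.FUNDS_LOCKED},
    TxState.FUNDS_LOCKED: {TxState.DELIVERED, TxState.REFUNDED},
    TxState.DELIVERED:    {TxState.VERIFIED, TxState.REFUNDED},
    TxState.VERIFIED:     {TxState.SETTLED},
    TxState.SETTLED:      set(),
    TxState.REFUNDED:     set(),
}

def step(state: TxState, nxt: TxState) -> TxState:
    """Advance the lifecycle; reject any transition the table does not allow."""
    if nxt not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {nxt.name}")
    return nxt
```

Because the table is explicit, no internal agent decision can, say, move funds from `INITIATED` straight to `SETTLED` without passing verification.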

The framework operates in two modes:

Escrow mode covers service tasks like report generation, code writing, or document preparation. Payment is held in escrow and released only after a trusted party verifies the work. Failed deliveries trigger refunds.

Underwriting mode addresses higher-stakes scenarios where agents must handle user funds before outcomes are known: trading, currency conversion, financial API calls. An underwriter evaluates the task, prices the risk, may require the agent provider to post collateral, and commits to reimbursing the user if specified failure conditions occur.
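The underwriting step can be pictured as a simple risk-pricing calculation. The function below is a hypothetical sketch: the names, the expected-loss-plus-loading premium formula, and the collateral ratio are assumptions for illustration, not formulas from the paper.

```python
def underwrite(task_value: float, est_failure_prob: float,
               loading: float = 0.2, collateral_ratio: float = 0.5) -> dict:
    """Illustrative underwriting quote (formula and parameters are assumed).

    The premium covers the underwriter's expected loss plus a margin;
    collateral posted by the agent provider makes misexecution costly
    on the agent side, while the user's downside is bounded at zero."""
    expected_loss = est_failure_prob * task_value
    premium = expected_loss * (1 + loading)
    collateral = collateral_ratio * task_value
    return {"premium": premium,
            "collateral": collateral,
            "max_user_loss": 0.0}  # user reimbursed if failure conditions occur

# A $1,000 trading task with an estimated 5% failure rate:
quote = underwrite(task_value=1000.0, est_failure_prob=0.05)
# -> premium 60.0, collateral 500.0, max_user_loss 0.0
```

In this toy version, raising `collateral_ratio` or `loading` tightens protection but makes transactions more expensive, the friction trade-off the simulation section describes.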

The logic mirrors existing industries. Financial markets use clearinghouses and margin requirements. Doctors carry malpractice insurance. Construction companies post performance bonds. ARS applies these same patterns to agents.

Simulation Results

The paper includes a simulation across 5,000 episodes modeling users, agent providers, and underwriters. User losses dropped 24% to 61% compared to ecosystems with no underwriting, depending on pricing and risk estimation settings. The collateral mechanism independently deterred 15% to 20% of risky transactions from executing, as fraud and misexecution now carry costs on the agent side.

The results also surface trade-offs: tighter underwriting improves user protection and underwriter solvency but introduces friction that reduces market participation, mirroring dynamics in traditional insurance and financial markets.

Authorship and Open-Source Release

The author roster spans frontier AI labs, academia, and infrastructure startups. Wenyue Hua (Microsoft Research), Tianyi Peng (Columbia University), and Chi Wang (Google DeepMind) are joined by Ian Kaufman and Chandler Fang of t54 Labs, a startup founded specifically to build the trust layer for agentic commerce, and Bryan Lim of Virtuals ACP. The research reflects individual scholarly contributions and does not represent the official positions of the authors' employers.

“Most trustworthy AI research aims to reduce the probability of failure. That work is essential, but probability is not a guarantee,” Hua told CrowdFund Insider. “ARS takes a complementary approach: instead of trying to make the model perfect, we formalize what happens financially when it isn’t.”

The framework is published on GitHub as an open standard.

The Infrastructure Stack Taking Shape

ARS arrives in the same week as Microsoft’s open-source Agent Governance Toolkit covering OWASP’s 10 agentic AI risks, Trent AI’s $13M seed for runtime agent security, and AWS’s Agent Registry for enterprise governance. Each addresses a different layer: policy enforcement, runtime protection, discovery and governance, and now financial settlement.

For teams deploying agents that touch money, ARS offers something the governance and security layers don’t: a mechanism to bound what happens when the agent is wrong. The open-source release positions it for adoption across platforms rather than locking it to a single vendor.