AI Token Spend Is Becoming a Line Item on Engineering Compensation: A CFO's Framework for Governing Agent Costs

Anand Murugan, founder and CEO of Blackbee AI, published a five-layer governance framework in Forbes Finance Council for CFOs trying to get ahead of AI token spend before it becomes unmanageable. The piece opens with an example that should concern any finance team: an engineer at a sports technology company was driving $600,000 per year in token spend across 40 different AI models, a fact no one in finance or engineering discovered until a third-party audit surfaced it.

The framework matters because token cost governance is the gap between “we deployed AI agents” and “we understand what our AI agents cost per business outcome.”

The Problem: Tokens Break Traditional Finance Controls

CFOs have spent two decades building levers for controlling technology spend: licenses, seats, headcount, infrastructure capacity, depreciation schedules. AI tokens conform to none of them, according to Murugan’s analysis.

A single token costs fractions of a cent. But enterprise users generate three to five iterations per task, and agentic workflows spawn sub-agents that consume thousands of tokens per request without a human in the loop. The result is a cost curve that behaves like cloud compute in 2017: variable, distributed, and invisible until the bill arrives.
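A back-of-the-envelope sketch shows how fractions of a cent compound. Only the three-to-five iteration range comes from the text above. The task volume, tokens per call, number of calls per iteration, and price per 1,000 tokens are hypothetical placeholders, not figures from Murugan's piece.

```python
def monthly_token_cost(tasks_per_month, iterations_per_task, calls_per_iteration,
                       tokens_per_call, price_per_1k_tokens):
    """Estimate monthly spend for one workflow. Every input is an assumption."""
    total_tokens = (tasks_per_month * iterations_per_task
                    * calls_per_iteration * tokens_per_call)
    return total_tokens / 1000 * price_per_1k_tokens


# Hypothetical workflow: 20,000 tasks a month, 2,000 tokens per call,
# $0.01 per 1,000 tokens.
# Low case: three iterations, one model call each (no sub-agents).
low = monthly_token_cost(20_000, 3, 1, 2_000, 0.01)
# High case: five iterations, each fanning out to six agent calls.
high = monthly_token_cost(20_000, 5, 6, 2_000, 0.01)

print(f"low:  ${low:,.0f} per month")
print(f"high: ${high:,.0f} per month")
```

Under these assumed inputs the same workflow costs roughly $1,200 a month in the low case and roughly $12,000 in the high case. The unit price never changes; the tenfold gap comes entirely from iteration count and agent fan-out, which is why the bill is hard to predict from a rate card alone.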

Average monthly enterprise AI spend grew 36% year over year between 2024 and 2025, from roughly $63,000 to $85,500, according to CloudZero’s State of AI Costs report. Deloitte notes that “unmanaged token growth can introduce material operational and financial risk just as more advanced reasoning models take hold.”

Most AI spend today appears as a lump-sum API bill or, worse, embedded inside an existing SaaS or cloud invoice. There is no procurement gate. There is no headcount approval. The cost scales with usage patterns that no one is tracking at the workflow level.

The Five Layers

Murugan’s framework maps directly from cloud FinOps to token governance:

1. Visibility before control. Instrument every AI call with metadata: which model, which workflow, which team, which use case. This is cost tagging for tokens. Without it, nothing else works.

2. Use-case attribution. Raw token counts are meaningless without context. The metric that matters is cost per business outcome: cost per resolved support ticket, cost per closed invoice, cost per generated lead. Murugan calls this the “agentic work unit” layer. It reframes the conversation from “how much are we spending?” to “what is each AI dollar producing?”

3. Tiered budgets. Not every task requires a frontier model. Build budgets around three tiers: premium models for complex reasoning, mid-range for standard tasks, small or open-source models for high-volume routine work. Force every AI initiative to declare its tier and justify premium use.

4. Engineering chargebacks. Push token cost accountability to the engineering and product leaders who control the design choices driving it: prompt length, context window size, retry logic, agent loop depth. Once engineering owns the bill, prompt optimization and caching become standard practice, not optional improvements.

5. Outcome-linked metrics. Every AI deployment ships with a defined business metric and a horizon: hours saved, error rate reduction, throughput per employee, revenue per agent. Tie spend to those metrics quarterly. Kill workflows that don’t deliver. Scale the ones that do.
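The first four layers can be sketched in a few lines of Python. This is an illustrative sketch, not Murugan's implementation or any vendor's API: the `AICall` record, the tier prices, and the team and workflow names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices per 1,000 tokens for the three tiers in layer 3.
TIER_PRICE_PER_1K = {"premium": 0.03, "mid": 0.003, "small": 0.0003}


@dataclass
class AICall:
    """Layer 1: every call carries its own cost tags."""
    model_tier: str
    workflow: str
    team: str
    use_case: str
    tokens: int

    @property
    def cost(self) -> float:
        return self.tokens / 1000 * TIER_PRICE_PER_1K[self.model_tier]


def cost_per_outcome(calls, outcomes_by_use_case):
    """Layer 2: divide tagged spend by the number of business outcomes."""
    spend = defaultdict(float)
    for call in calls:
        spend[call.use_case] += call.cost
    return {use_case: spend[use_case] / count
            for use_case, count in outcomes_by_use_case.items() if count > 0}


def chargeback_by_team(calls):
    """Layer 4: total the bill for each team that owns the design choices."""
    bill = defaultdict(float)
    for call in calls:
        bill[call.team] += call.cost
    return dict(bill)


calls = [
    AICall("premium", "support-triage", "support-eng", "resolved_ticket", 40_000),
    AICall("small", "support-triage", "support-eng", "resolved_ticket", 200_000),
    AICall("mid", "invoice-matching", "finance-eng", "closed_invoice", 100_000),
]
unit_costs = cost_per_outcome(calls, {"resolved_ticket": 10, "closed_invoice": 5})
team_bills = chargeback_by_team(calls)
print(unit_costs)
print(team_bills)
```

The design point is that attribution and chargebacks are simple aggregations once the tags exist. Without the metadata from layer 1, neither function has anything to group by.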

The Compensation Question

The most provocative section of Murugan’s piece addresses a conversation he argues finance teams will face within 12 months. Nvidia CEO Jensen Huang has suggested that an engineer earning $500,000 should be using $250,000 in AI tokens annually. Venture capitalist Tomasz Tunguz has observed technology companies “already adding inference costs as a fourth component to engineering compensation,” alongside salary, benefits, and equity, as reported by TechCrunch.

If that trend holds, total cost-to-company calculations need a token line modeled by role and seniority. Headcount planning must account for compute productivity multipliers. Capacity forecasting becomes a joint exercise between finance, engineering, and procurement, with token commits negotiated alongside cloud commits.
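A token line in a cost-to-company model might look like the sketch below. The `RoleCost` structure is hypothetical. The salary and token figures are the ones Huang suggested; the benefits and equity amounts are invented placeholders used only to make the total computable.

```python
from dataclasses import dataclass


@dataclass
class RoleCost:
    """Annual cost-to-company with inference as a fourth component."""
    salary: float
    benefits: float
    equity: float
    token_budget: float

    @property
    def total(self) -> float:
        return self.salary + self.benefits + self.equity + self.token_budget

    @property
    def token_to_salary_ratio(self) -> float:
        return self.token_budget / self.salary


# Salary and token budget follow Huang's example; benefits and equity
# are hypothetical values, not figures from the article.
senior_engineer = RoleCost(salary=500_000, benefits=75_000,
                           equity=150_000, token_budget=250_000)

print(f"total cost-to-company: ${senior_engineer.total:,.0f}")
print(f"tokens as share of salary: {senior_engineer.token_to_salary_ratio:.0%}")
```

Modeling the ratio per role and seniority, rather than a flat per-head allowance, is what lets finance compare token budgets across teams in headcount planning.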

The Agent Governance Connection

This framework intersects directly with the broader enterprise agent governance challenge. NCT has previously covered reports from Microsoft and Uber that AI implementation costs exceeded their original labor-replacement budgets, with Uber exhausting its 2026 AI budget in four months. The pattern Murugan describes is the same: without unit economics at the workflow level, AI spending optimizes for activity, not outcomes.

For teams deploying agents through OpenClaw, Hermes, or any other framework, the five-layer model suggests a practical checklist. Can you tag every agent call by model, workflow, and use case? Can you measure cost per business outcome, not just cost per token? Can you route tasks to the cheapest model that handles them adequately? If the answer to any of those is no, the finance team is flying blind on a cost category that scales automatically and invisibly.
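The third checklist question, routing each task to the cheapest adequate model, can be sketched as a small selection function. This is a simplified illustration and not part of OpenClaw, Hermes, or any other framework. The tier names, prices, and the 1-to-3 capability score are hypothetical; in practice "adequate" would have to be established by evaluating each model on the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    price_per_1k_tokens: float  # hypothetical price
    capability: int             # hypothetical score: 1 = routine, 3 = complex reasoning


TIERS = [
    ModelTier("small", 0.0003, 1),
    ModelTier("mid", 0.003, 2),
    ModelTier("premium", 0.03, 3),
]


def cheapest_adequate(required_capability, tiers=TIERS):
    """Return the lowest-priced tier whose capability meets the requirement."""
    candidates = [t for t in tiers if t.capability >= required_capability]
    if not candidates:
        raise ValueError(f"no tier meets capability {required_capability}")
    return min(candidates, key=lambda t: t.price_per_1k_tokens)


# Example: a routine classification task versus a complex reasoning task.
routine = cheapest_adequate(1)
complex_task = cheapest_adequate(3)
print(routine.name, complex_task.name)
```

Raising an error when no tier qualifies, instead of silently falling back to the premium model, keeps premium use an explicit decision that someone has to justify.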
