When you buy electricity, you buy kilowatt-hours. When you buy bandwidth, you buy bits per second. When you buy storage, you buy gigabytes. When you buy artificial intelligence in 2026, you buy tokens, which are easy to meter and easy to price, but hard to value. This sounds like a technical detail. It isn’t. It may become the most important AI procurement question enterprises face over the next 18 months.
What You Actually Buy
The three largest AI providers in the world publish their prices in tokens. A token is a model-specific unit of text representation. Different models tokenize identical content differently, which makes token counts a poor proxy for business value. Anthropic charges $5 per million input tokens for Claude Opus 4.8. OpenAI charges $5 per million input tokens for GPT-5.5. Google undercuts both at $2 per million for Gemini 3 Pro. On the published rate card, the two American flagships are priced in lockstep and Google looks like the bargain of the category.
Gemini looks less expensive to use. But tokens are an artifact of how transformer models process text and how vendors meter it, with no inherent relationship to the value the buyer extracts. The same task on the same input can consume wildly different token counts from one model to the next, depending on each model’s reasoning style, prompt processing, tool calls, and verbosity defaults.
Anthropic says the new tokenizer in Opus 4.8 can use up to 35% more tokens for the same fixed text than earlier Opus models. That means the effective cost of the same job can rise by as much as 35% without the published price changing by a penny. At the rates above, a Gemini task can run through two and a half times as many input tokens as a Claude task and arrive at the same total bill. The token price tells you almost nothing about the actual transaction. This would be a minor accounting problem if everyone agreed on a better unit.
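The arithmetic is easy to check. The sketch below uses the input-token prices quoted above; the 200,000-token job size and the `job_cost` helper are illustrative assumptions, and output-token charges are ignored to keep it simple.

```python
def job_cost(tokens: int, usd_per_million: float) -> float:
    """Dollar cost of a job at a given per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million


# Hypothetical job size; only input tokens are counted here.
JOB_TOKENS = 200_000

# Same text, same $5 rate, but a tokenizer that emits up to 35% more tokens.
old_tokenizer_cost = job_cost(JOB_TOKENS, 5.00)
new_tokenizer_cost = job_cost(round(JOB_TOKENS * 1.35), 5.00)

# How many times more tokens a $2 model can consume before its bill
# matches a $5 model's bill for the same job.
breakeven_multiple = 5.00 / 2.00

print(f"old tokenizer: ${old_tokenizer_cost:.2f}")
print(f"new tokenizer: ${new_tokenizer_cost:.2f}")
print(f"break-even token multiple: {breakeven_multiple:.1f}x")
```

Under these assumptions the rate card never moves, yet the bill for the same text does. The cheaper model stays cheaper only while it uses fewer than 2.5 times the tokens.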
The Candidate Units
The candidate units of intelligence currently in circulation are, roughly:
Tokens. What the API meter shows. Technically precise. Tokens are economically meaningful to the seller but only indirectly meaningful to the buyer for the reasons stated above. Favored by foundation model companies because tokens let sellers advertise plunging prices for capability they were not actually pricing in the first place.
Compute, measured in FLOPs or GPU-hours. What it costs to run the model. The unit Nvidia loves, because Nvidia sells the substrate. Useful for capacity planning, useless for procurement.
Task-completion length, measured by something like METR’s benchmark. What the model can actually do, expressed in how long it would take a competent human. METR’s data shows the longest task an AI can reliably complete is doubling every four months, down from every seven a year ago. Arguably the most economically meaningful unit currently available. It is also the hardest to standardize across vendors, which is why no vendor has rushed to adopt it.
Agent-hours or completed agentic tasks. What an autonomous AI system finishes on your behalf, measured in business outcomes rather than model calls. The unit Anthropic and OpenAI are starting to position around, because it lets them sell solutions rather than API calls. It is also the unit Salesforce, ServiceNow, and Microsoft want to own, because they already sell business outcomes by the seat and have decades of experience charging for them.
Intelligence-per-watt. What you get per kilowatt-hour of inference energy. The Nvidia and hyperscaler unit. This one matters more than people realize when the marginal cost of intelligence converges on the marginal cost of electricity, which is Sam Altman’s well-known prediction.
Each candidate unit advantages a different vendor. Each vendor knows this. The competition to define which unit becomes the industry standard is more strategically important than the competition to ship the next model release. Models become obsolete in quarters. Units, once adopted, last decades.
Whose Unit Is It?
AI vendors sell tokens, agent runs, context windows, and benchmark scores. Enterprises buy contracts reviewed, tickets resolved, campaigns launched, code shipped, and revenue generated.
A CIO does not care how many tokens a task consumed. A CIO cares whether the task got done, how well it was done, and what it cost compared to the alternative.
That distinction is where the real unit war will be fought. Every vendor wants the market measured in the units that make its products look most valuable. Every buyer wants the market measured in the units that map directly to business outcomes.
The company that successfully defines the buyer’s unit will have far more pricing power than the company that merely defines the technical one.
The Unit Will Define the Market
In every previous infrastructure category, the company that defined the unit captured the market for a generation. Large-scale electricity markets accelerated after standardized metering and billing units emerged. Network bandwidth standardized on the megabit per second. Amazon turned compute into a broadly consumable utility product when it standardized on selling EC2 hours. The unit is the focus of the contract.
The AI unit war is already underway. Anthropic recently disclosed that a substantial majority of code merged into some internal repositories was AI-generated. Most people heard a capability story. I heard a unit story. The implicit benchmark is “lines of merged code per engineer per day.” If that becomes an industry-standard productivity unit, every other vendor will have to match the metric or argue against it, and both outcomes favor Anthropic.
OpenAI’s push around agents that can handle multi-hour tasks is a unit move. If “completed agentic task” becomes the unit, OpenAI’s investment in orchestration becomes the moat. Google’s integration of AI Mode into Search is a unit move. If “answer engine query” becomes the unit, Google’s distribution becomes the moat. Salesforce’s Agentforce pricing is a unit move. Microsoft’s Copilot-per-seat pricing is a unit move. Nvidia’s intelligence-per-watt benchmarking is a unit move.
Six different units, six different attempts to measure value, each one most beneficial to its own creator. None of them is necessarily wrong. They are simply optimized for different buyers, sellers, and economic incentives.
Defining Your Unit
We’ve been working with our clients to define a Price Per Intelligence Unit and using that definition to compare costs on an apples-to-apples basis. Not in the abstract. Concretely, what does it cost to get:
- A marketing brief processed end-to-end with brand-aligned output?
- A customer service ticket resolved without human escalation?
- A legal contract reviewed for ten specific risk flags?
- A code review completed at a defined quality bar?
- A sales lead qualified against your eight-point rubric?
You can do this yourself. Pick the unit that maps to a job your business actually performs. Measure it in dollars per unit today, against the human-only baseline. Track the trajectory. The price per intelligence unit, on your metric, is falling fast, probably faster than your planning models assume. The compounding effect across a two-year capital cycle is enormous, and it will probably change the answer to several “build vs. buy” decisions on your roadmap.
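A minimal sketch of that measurement follows, assuming you can count the units that met your quality bar and total the all-in spend behind them. The `UnitRun` class, the vendor labels, and every dollar figure are hypothetical placeholders, not data from any real evaluation.

```python
from dataclasses import dataclass


@dataclass
class UnitRun:
    """One measured batch of work on a unit you defined (e.g. tickets resolved)."""
    label: str
    total_cost_usd: float  # all-in spend: vendor bill plus any human review time
    units_accepted: int    # units that met your quality bar


def cost_per_unit(run: UnitRun) -> float:
    """Dollars per accepted unit; rejected work adds cost but no units."""
    if run.units_accepted <= 0:
        raise ValueError(f"{run.label}: no accepted units to price")
    return run.total_cost_usd / run.units_accepted


# All figures below are made up for illustration.
baseline = UnitRun("human-only", total_cost_usd=1200.0, units_accepted=100)
candidates = [
    UnitRun("vendor A", total_cost_usd=240.0, units_accepted=80),
    UnitRun("vendor B", total_cost_usd=150.0, units_accepted=60),
]

baseline_cost = cost_per_unit(baseline)
for run in sorted(candidates, key=cost_per_unit):
    cost = cost_per_unit(run)
    print(f"{run.label}: ${cost:.2f} per unit, {cost / baseline_cost:.0%} of baseline")
```

Dividing by accepted units rather than attempted ones is a deliberate choice here: a vendor that is cheap per call but fails your quality bar often should look expensive on your unit. A fuller model would also price the human fallback for rejected work.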
When vendor evaluations finally happen, compare vendors on your unit. Make them justify their pricing inside your framework. The conversation changes when the buyer brings the unit definition. You stop being a tokens customer and start being a procurement counterparty the vendor has to negotiate with on your terms.
The Bigger Picture
The price per intelligence unit is collapsing across every reasonable definition of the unit. Eight times more code per day. Four-month task-length doubling. Four times cheaper per token year over year on roughly equivalent capability. Pick any unit and the trajectory is the same: down and to the right, on a curve few commodities have ever traced before. This is the deflation that ate the cost of computation in the 1990s, the cost of storage in the 2000s, and the cost of bandwidth in the 2010s, happening faster and across more dimensions simultaneously.
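To see what those rates imply over a two-year capital cycle, here is a rough compounding sketch. It assumes the quoted rates hold constant for the full 24 months, which is an extrapolation, not a forecast; the `compound` helper is an illustrative name.

```python
def compound(factor_per_period: float, period_months: float, horizon_months: float) -> float:
    """Total multiplier after compounding `factor_per_period` over the horizon."""
    return factor_per_period ** (horizon_months / period_months)


HORIZON_MONTHS = 24  # a two-year capital cycle

# Task length doubling every four months.
task_length_multiple = compound(2.0, 4, HORIZON_MONTHS)

# Token price falling to one quarter of its level every twelve months.
token_price_multiple = compound(0.25, 12, HORIZON_MONTHS)

print(f"task length after two years: {task_length_multiple:.0f}x today's")
print(f"token price after two years: {token_price_multiple:.4f}x today's")
```

If the trends held, task length would grow 64-fold and the per-token price would fall to one sixteenth of today's level within a single planning cycle.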
Model quality benchmarks are the measurements getting the big headlines, but a universally agreed-upon unit definition is going to be the key to quantifying the enterprise value of AI. Start by defining your own.
Author’s note: This is not a sponsored post. I am the author of this article and it expresses my own opinions. I am not, nor is my company, receiving compensation for it. This work was created with the assistance of various generative AI models.