Original Thought Paper · V2

The Correct Economics of AI

From Token Consumption to Revenue Flywheel: Value Density, Management Misjudgment, and Cloud-Edge Triage in the AI Industry
LEECHO Global AI Research Lab & GPT 5.5
May 11, 2026
Language: English
Version: V2
TOKEN VALUE DENSITY · REVENUE FLYWHEEL · AI FORMALISM

Abstract & Methodology

This paper argues that the core question of AI economics is not whether models can generate content, nor whether users are willing to pay for AI, but whether tokens can be converted into verifiable, reusable, deliverable, and monetizable economic value.

Over the past three years, two facts have simultaneously emerged in the AI industry: on one hand, hardware, storage, cloud infrastructure, and some AI application companies have already earned real revenue; on the other hand, a large number of enterprise projects have yet to enter the income statement, and a large volume of consumer-side AI output has failed to demonstrate sustainable monetization capability. This reveals that the AI industry’s problem is not “no revenue,” but that revenue, cost, output, and reinvestment have not yet universally formed a healthy revenue loop.

Methodological Note: This paper is a thought paper formed through public data, industry observation, and conversational reasoning — not an experimental paper, audit report, or financial forecast. External data serves to support trend assessments and conceptual frameworks. The core contributions lie in proposing analytical concepts including “Token Value Density,” “Effective Token Cost,” “AI Formalism,” and “Task-Value-Density-Based Token Triage.”

Core thesis: Tokens are fuel consumption, not mileage; prompts are actions, not outcomes; AI usage rate is process, not value. Only tokens that enter a revenue loop constitute genuine productivity.

Chapter 1: Framing the Problem — AI’s Crisis Is Not in Models but in the Revenue Loop

The most easily misread aspect of the AI industry is equating “model capability improvement,” “user growth,” “token consumption increases,” and “enterprise procurement expansion” directly with economic value creation. In fact, these metrics only demonstrate that AI is being used, experimented with, and purchased — they do not prove that AI has formed a sustainable revenue loop.

A genuine revenue loop should contain five stages: users invest cost, AI produces verifiable output, output enters business or market systems, users achieve revenue growth or cost reduction, and users continue to expand investment. If any stage fractures, AI degrades from a production tool to a consumption tool, or even to a cost black hole.

Investment: Subscription, API fees, learning costs, process redesign

Output: Code, documents, customer service, analysis, Agent actions

Verification: Testable, auditable, deliverable, reusable

Returns: Revenue growth, cost reduction, cycle compression

Reinvestment: Continue purchasing, integrating, scaling, optimizing

This paper calls this the first question of AI economics: Can tokens transform from cost units into value units?

Chapter 2: Hardware-Side Certainty of Returns vs. Application-Side Uncertainty of Value

In this AI cycle, the first entities to achieve certain cash flows were not end users, but hardware, storage, packaging, data center, power, and cloud infrastructure companies. They sell deterministic compute and infrastructure, while AI application companies sell future productivity expectations.

Layer | External Data | Economic Implication
GPU / Accelerated Computing | NVIDIA FY2026 full-year revenue of $215.9B, up 65% YoY; Q4 data center revenue of $62.3B, up 75% YoY. | AI CapEx has directly converted into hardware revenue.
Storage / HBM | Micron FY2026 Q1 set multiple revenue records; SK hynix 2025 sales and operating profit reached all-time highs. | AI server and HBM demand has made storage companies direct beneficiaries.
Power / Data Centers | Large-scale AI training and inference are driving construction of data centers, power systems, cooling, and network equipment. | Even if some applications fail, infrastructure may persist as the long-term digital economy foundation.

However, the hardware side achieving cash flow does not mean all AI infrastructure investment will yield long-term returns. If application-layer ROI cannot be continuously validated, some GPU, storage, data center, and power projects may face cyclical overcapacity.

Hardware companies sell deterministic compute; AI application companies sell productivity expectations. The former has already delivered; the latter is still proving its case.

Chapter 3: Application-Layer Revenue Is Real, but the Revenue Loop Has Not Been Universally Proven

AI application revenue is not fictitious. OpenAI officially disclosed ARR growing from approximately $2B in 2023, to approximately $6B in 2024, to over $20B in 2025. Menlo Ventures estimates 2025 enterprise GenAI spending at $37B, with approximately $19B at the application layer and approximately $18B at infrastructure.

These figures demonstrate that AI services can be sold and that both enterprises and consumers are willing to purchase AI. However, they still do not prove that the content, code, reports, and Agent behaviors users produce with AI have universally formed revenue loops.

Real revenue does not equal a real revenue loop. Enterprises willing to buy AI does not mean AI has entered the income statement; consumers willing to subscribe to AI does not mean consumers can sustainably monetize AI output.

Anthropic / Claude’s growth better illustrates the direction of high-value workflows. Multiple media and market reports indicate that Anthropic’s run-rate revenue reached the $30B level during April–May 2026, with some reports suggesting it has surpassed OpenAI on an annualized basis. However, this paper does not present this as audit-confirmed financial fact, but rather as an industry signal of rapid growth in high-value enterprise and developer scenarios.

Chapter 4: The Consumer Paradox — Users Buy AI, but AI Output Cannot Be Sold

The consumer side is not lacking in demand. Sensor Tower data shows that in H1 2025, generative AI app downloads globally approached 1.7 billion, with in-app purchase revenue approaching $1.9B; ChatGPT continues to command a significant share of consumer-side revenue.

However, consumer spending data proves that “AI services can be sold,” not that “AI output can be sold.” A user willing to pay subscription fees for AI only demonstrates that AI has consumer value as a tool, entertainment, companion, search, or writing assistant; it does not prove that articles, images, videos, code, scripts, presentations, and social content produced through AI can be sustainably sold on the market.

PROVEN: Consumers Are Willing to Buy AI Services

Subscriptions, in-app purchases, mobile usage, daily Q&A, and writing needs have been validated by the market.

NOT UNIVERSALLY PROVEN: Consumers Can Sell AI Output

AI-generated content supply has surged, homogenization is severe, per-unit prices are declining, and monetization evidence remains insufficient.

The real crisis on the consumer side is not a lack of usage, but the disconnect between usage volume and monetization capability. High volumes of low-value, high-frequency interactions can generate engagement, but cannot form sustainable profit.

The consumer side is not inherently low-value; what is low-value is interaction that cannot be verified, reused, delivered, or monetized — yet continuously consumes tokens.

Chapter 5: The B2B ROI Gap — From Adoption Rate to Income Statement

The problem for B2B enterprise users is not “not buying AI,” but “whether AI enters the income statement after purchase.” The widely cited conclusion from MIT-related research is that most enterprise GenAI projects have not produced measurable P&L impact. Data from McKinsey, Deloitte, and Gartner also shows that while AI usage is widespread, measurable ROI, EBIT impact, and Agent success rates remain limited.

External Signal | Data / Assessment | Significance for This Paper
MIT / Enterprise GenAI Projects | A large number of enterprise projects have not produced measurable P&L impact. | Demonstrates "high adoption" ≠ "high implementation success."
McKinsey | Only some enterprises report AI impact on EBIT, and many impacts are limited in magnitude. | Demonstrates a persistent gap between AI usage and income statement impact.
Deloitte | The share of significantly measurable ROI remains limited; Agentic AI even more so. | Demonstrates Agents are still in the value-validation stage.
Gartner | Predicts over 40% of Agentic AI projects will be canceled by end of 2027, citing high costs, unclear business value, and insufficient risk controls. | Demonstrates "going agentic" ≠ automatic success.

Enterprise AI project failures are often not because the model is completely unusable, but because the organization has not placed AI within a measurable business loop. Meeting summaries, presentations, knowledge base Q&A, and internal assistants can boost usage metrics, but unless they reduce headcount costs, improve sales conversion, shorten R&D cycles, or lower customer service costs, they are unlikely to become a sustained budget line item.

Chapter 6: Token Value Density Theory

V2’s core theory is Token Value Density. It distinguishes between “consuming many tokens” and “creating much value.”

Definition 1: Token Value Density

Token Value Density is the verifiable, reusable, deliverable, and monetizable value produced per unit of tokens.

Token Value Density = Verifiable Economic Value ÷ Total Effective Token Cost

This is not a precise financial formula but an economics concept within a thought paper. It reminds us that the same 1 million tokens, when applied to production-system fault repair, code migration, customer service automation, financial risk management, or scientific research, may yield high value; when applied to low-quality articles, repetitive images, ineffective Agent trial-and-error, or AI Slop, may yield almost no value, or even negative value.

High Value / Low Cost: Efficient small models, local automation, specialized tools, structured tasks.

High Value / High Cost: Enterprise Agents, complex code, finance, legal, research, production systems.

Low Value / Low Cost: Social, companionship, light Q&A, tone rewriting, low-risk entertainment.

Low Value / High Cost: AI Slop, Tokenmaxxing, ineffective Agents, chaotic prompt trial-and-error.

The correct economics of AI is not about maximizing total token volume, but about minimizing per-outcome token cost and maximizing Token Value Density.
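Treated as a back-of-the-envelope calculation rather than a financial model, the density formula can be sketched in code. The workloads, dollar figures, and field names below are hypothetical illustrations, not data from this paper:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Hypothetical workload record; fields are illustrative, not from the paper."""
    name: str
    verifiable_value_usd: float   # value that survived verification and was delivered
    total_effective_tokens: int   # list-price tokens plus hidden and rework tokens

def token_value_density(w: Workload) -> float:
    """Token Value Density = verifiable economic value / total effective token cost."""
    return w.verifiable_value_usd / w.total_effective_tokens

# Same nominal token budget, very different densities:
migration = Workload("code migration", verifiable_value_usd=50_000, total_effective_tokens=1_000_000)
slop      = Workload("AI slop feed",   verifiable_value_usd=200,    total_effective_tokens=1_000_000)

print(token_value_density(migration))  # 0.05 USD per token
print(token_value_density(slop))       # 0.0002 USD per token
```

The point of the comparison is that an identical token budget can differ in density by orders of magnitude depending on the task it funds.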

Chapter 7: Effective Token Cost

Definition 2: Effective Token Cost

Effective Token Cost is not the API list price, but the total cost of tokens, human verification, rework, tool invocations, context management, failed trial-and-error, and risk review required to complete one deliverable outcome.

AI platforms frequently advertise declining per-token prices, but what users actually purchase is not tokens — it is outcomes. A single deliverable outcome often involves multi-turn conversations, long context ingestion, retrieval, tool invocations, Agent planning, code execution, error correction, human review, and regeneration.

List-Price Tokens: API unit price for input and output

Hidden Tokens: Long context, multi-turn corrections, tool calls

Human Cost: Verification, review, testing, compliance

Failure Cost: Rework, misdirection, bugs, opportunity cost

Therefore, declining per-token prices do not necessarily mean declining AI usage costs. If task complexity, verification costs, and rework iterations rise in tandem, the total cost of completing a usable outcome may actually increase.
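The four components above can be combined into a per-outcome cost sketch. All prices, hours, and overhead ratios below are invented for illustration:

```python
def effective_token_cost(
    list_price_tokens: int,
    hidden_tokens: int,          # long context, multi-turn corrections, tool calls
    price_per_token: float,      # blended API price in USD
    human_hours: float,          # verification, review, testing, compliance
    hourly_rate: float,          # USD per hour of human review
    failure_rework_usd: float,   # rework, misdirection, bugs, opportunity cost
) -> float:
    """Total cost of ONE deliverable outcome, not the API list price alone."""
    token_spend = (list_price_tokens + hidden_tokens) * price_per_token
    human_spend = human_hours * hourly_rate
    return token_spend + human_spend + failure_rework_usd

# Illustrative numbers: a cheap per-token price can still yield a costly outcome.
cost = effective_token_cost(
    list_price_tokens=200_000,
    hidden_tokens=600_000,       # 3x hidden overhead from context and retries
    price_per_token=5e-6,        # $5 per million tokens
    human_hours=2.0,
    hourly_rate=80.0,
    failure_rework_usd=40.0,
)
print(round(cost, 2))  # 204.0
```

In this made-up example the API spend ($4) is a small fraction of the $204 outcome cost, which is why a falling list price need not lower the effective cost of a usable outcome.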

Chapter 8: AI Formalism — Mistaking Fuel Consumption for Mileage

Definition 3: AI Formalism

AI Formalism is the management error of substituting process metrics — such as token consumption, prompt count, AI usage rate, AI code share, and Agent run count — for business outcome metrics.

After April 2026, the engineering community began discussing “tokenmaxxing”: enterprises, teams, or individuals deliberately increasing token consumption, prompt counts, and Agent invocations to appear AI-native, without necessarily producing commensurate productivity gains. Business Insider reported Jellyfish’s finding: the top 10% of AI users consume approximately 10× the tokens, yet productivity is only approximately 2×.

This demonstrates precisely that tokens are cost metrics, adoption metrics, and risk-monitoring metrics — not output metrics, capability metrics, or performance metrics.

Wrong Process Metric | What It Looks Like | What It May Actually Be | Should Be Replaced With
Token consumption | Active AI usage | Inefficient trial-and-error, prompt chaos, cost inflation | Per-outcome token cost
Prompt count | AI-native employees | Poor problem decomposition, repetitive queries | Task completion rate, delivery cycle
AI code share | Improved code production | Review burden, bugs, rising tech debt | Defect rate, rollback rate, maintainability
Agent run count | High automation | Runaway agents, wasted compute, no acceptance | Closed-loop completion rate
AI tool coverage | Successful transformation | Formalistic box-checking | EBIT, cost reduction, revenue growth

Tokens are fuel consumption, not mileage; prompts are actions, not outcomes; AI usage rate is process, not value.

Chapter 9: Digital Waste and the Negative Externalities of AI Slop

Definition 4: Digital Waste

Digital Waste is AI output that cannot be verified, cannot be reused, cannot be delivered, and cannot be monetized — yet consumes tokens, human attention, review costs, and trust systems.

The problem with AI Slop is not merely low-quality content, but that it transfers the low cost of the generation side into high cost on the verification side. Low-quality vulnerability reports, auto-generated documentation, invalid analyses, pseudo-code, marketing spam, and homogenized content all force humans to spend time filtering, fact-checking, deleting, and repairing.

HBR / BetterUp Labs / Stanford research on workslop indicates that AI-generated low-quality work content creates rework burdens for colleagues; RedMonk’s observations on the open-source ecosystem also show that AI-generated vulnerability reports are polluting professional collaboration systems.

Low-cost generation does not equal low social cost. The true cost of AI Slop lies not on the generation side, but on the verification side, the filtering side, and the trust system side.

Therefore, if AI output cannot be verified, reused, and delivered, it is not merely “valueless” — it may carry negative value.

Chapter 10: Token Triage and Cloud-Edge Synergy

The key to solving AI economic mismatch is not routing all tasks to the most powerful model, but matching each task type to the appropriate cost structure. This paper calls this “Task-Value-Density-Based Token Triage.”

Definition 5: Token Triage

Token Triage is the economic method of allocating different AI demands to local models, specialized small models, cloud general models, or cloud frontier reasoning models based on task value density, verifiability, reusability, risk level, and latency requirements.

Apple Intelligence, Microsoft Copilot+ PC, and on-device NPUs already demonstrate that some everyday interactions can be completed on local models or edge devices. Apple’s on-device models and Private Cloud Compute form a cloud-edge synergy; Microsoft Copilot+ PC also emphasizes local NPU processing for low-latency, privacy-sensitive, or high-frequency tasks.

Task Type | Value Density | Recommended Model Structure | Economic Purpose
Enterprise code, Agents, finance, legal, research | High | Cloud frontier model + tool calls + audit system | Maximize verifiable ROI
High-value personal production tasks | Medium-high | Cloud strong model + local toolchain | Serve developers, creators, consultants
Social, companionship, tone rewriting, light Q&A | Medium-low | Local / on-device / lightweight cloud model | Reduce marginal cost, improve privacy
Low-value enterprise formalism tasks | Low or negative | Restrict, audit, or cancel | Prevent Tokenmaxxing and AI formalism
Local models are not zero-cost. They convert ongoing cloud inference costs into device depreciation, chip capability, on-device model updates, and security governance costs. Their advantage lies in the reduced marginal cost of high-frequency lightweight tasks.

The true triage criterion is not B2B vs. consumer identity, but task value density.
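The triage table can be read as a simple routing policy. The sketch below is one possible encoding under assumed density labels and tiers, not a production router:

```python
from enum import Enum

class Tier(Enum):
    LOCAL = "local / on-device model"
    SMALL = "specialized small model"
    CLOUD = "cloud strong model + local toolchain"
    FRONTIER = "cloud frontier model + tool calls + audit system"
    REJECT = "restrict, audit, or cancel"

def triage(value_density: str, latency_sensitive: bool = False) -> Tier:
    """Route a task by value density, following the triage table above.
    Density labels and routing rules are illustrative, not a production policy."""
    if value_density == "high":
        return Tier.FRONTIER           # maximize verifiable ROI
    if value_density == "medium-high":
        return Tier.CLOUD              # developers, creators, consultants
    if value_density == "medium-low":
        # high-frequency lightweight tasks: cut marginal cost, keep privacy
        return Tier.LOCAL if latency_sensitive else Tier.SMALL
    return Tier.REJECT                 # low or negative density: formalism risk

print(triage("high").name)                                 # FRONTIER
print(triage("medium-low", latency_sensitive=True).name)   # LOCAL
```

Note that the policy keys on the task, not on whether the caller is an enterprise or a consumer, which is exactly the triage criterion the chapter argues for.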

Chapter 11: GPT vs. Claude Route Divergence

The competition between GPT and Claude should not be understood solely as model capability competition, but as competition over user structure, token structure, and revenue flywheel structure.

GPT Route: Mass-Entry AI

Strengths are user scale, brand, entry points, consumer subscriptions, and ecosystem imagination. Risk is excessive low-value consumer traffic creating compute burden, data noise, and product direction dilution.

Claude Route: Professional-Production AI

Strengths are enterprise, developer, code Agent, high ARPU, and high-value workflows. Risk is weak consumer entry, limited distribution, and sensitivity to enterprise budgets and compute supply.

This paper does not argue that Claude has comprehensively surpassed GPT across all dimensions. The more accurate assessment is: GPT remains the consumer-side scale leader, while Claude is closer to a healthy revenue flywheel in enterprise, developer, code, and Agent high-value scenarios.

If GPT continues to fixate on massive consumer data and mass entry points without completing Token Triage and high-value task specialization, it may slide toward the platform risk of being “large but diffuse, strong but not sharp.” Conversely, if it can localize and lighten low-value interactions while toolifying and productizing high-value tasks, it can still rebuild a positive flywheel.

Chapter 12: The Correct Economics of AI

The correct economics of AI does not study who has the most users, the most tokens, or the most model parameters — it studies how tokens convert into verifiable, reusable, deliverable, and monetizable economic value.

1. Measure Returns, Not Buzz

User activity, downloads, and session counts only prove usage — not returns.

2. Measure Per-Outcome Cost, Not Token Volume

High token usage may indicate high output, or high rework, high chaos, and high waste.

3. Measure Task Value Density, Not User Identity

B2B is not inherently high-value; consumer is not inherently low-value. What matters is whether the task is verifiable, reusable, and monetizable.

4. Measure the Revenue Flywheel, Not AI Usage Rate

AI usage rate is process; income statement impact, delivery quality, and cost reduction are outcomes.

AI failure does not necessarily come from insufficient model capability, but may come from three types of mismatch: user structure mismatch, cost structure mismatch, and management metric mismatch. Only tokens that enter a revenue loop constitute genuine productivity.

The true watershed for the AI industry is whether tokens can transform from cost units into value units. The hardware side has achieved deterministic cash flow, the application side is proving the revenue loop, the enterprise side needs to escape AI formalism, and the consumer side needs to move from low-value generation to high-value output.

The correct economics of AI is ensuring that every unit of intelligence cost enters a verifiable value loop.

References & Data Sources

The following sources support this paper’s trend assessments. This paper does not use them as experimental proof, but as external data anchor points for a thought paper.

  1. NVIDIA, “Financial Results for Fourth Quarter and Fiscal 2026”: FY2026 revenue of $215.9B; Q4 data center revenue of $62.3B.
  2. OpenAI, “A business that scales with the value of intelligence”: OpenAI ARR approximately $2B in 2023, approximately $6B in 2024, over $20B in 2025.
  3. Menlo Ventures, “2025: The State of Generative AI in the Enterprise”: 2025 enterprise GenAI spending of $37B; application layer approximately $19B.
  4. Sensor Tower, “State of AI Apps Report 2025”: H1 2025 generative AI app downloads approaching 1.7B; IAP revenue approaching $1.9B.
  5. Gartner, “Over 40% of Agentic AI Projects Will Be Canceled by End of 2027”: Agentic AI project cancellation risk.
  6. Business Insider / Jellyfish, “tokenmaxxing”: Top 10% of AI users consume approximately 10x the tokens, yet productivity is only approximately 2x.
  7. Deloitte, “AI ROI: The paradox of rising investment and elusive returns”: The gap between growing AI investment and measurable ROI.
  8. Apple Machine Learning Research, “Apple Foundation Models Tech Report 2025”: On-device models and Private Cloud Compute cloud-edge synergy.
  9. HBR / BetterUp Labs / Stanford research on workslop, and RedMonk public discussions on AI Slop.

This paper is a thought paper and does not constitute investment advice, financial audit, experimental research, or company valuation conclusions.
