Artificial intelligence is mispriced by an order of magnitude. The industry sells inference below marginal cost and calls it growth; the gap will close through higher prices and lower usage. We are currently living through the most expensive "free trial" in human history, where the true cost of our curiosity is being hidden by billions in venture capital.
As we look toward the 2026-2027 fiscal cycle, the artificial intelligence industry is approaching a reckoning point where subsidised experimentation must give way to sustainable business models. Here is a strategic outlook on the future of AI economics.
For the past several years, frontier model labs have operated under a "loss-leader" strategy, heavily subsidising the cost of compute to gain market share. Anthropic, for example, has indicated that a standard USD 200 monthly subscription can represent upwards of USD 5,000 in underlying compute costs. Other estimates suggest that major AI players such as OpenAI would need to raise prices by as much as 25 times just to break even.
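As a rough sanity check, the cited figures already imply that multiple. The minimal sketch below simply assumes the USD 200 subscription and USD 5,000 compute numbers above are representative of a heavy user; it is illustrative arithmetic, not measured data.

```python
# Back-of-the-envelope check on the subsidy gap described above.
# Figures are the article's cited examples, not measured data.

subscription_price = 200.0    # USD per month (cited subscription price)
underlying_compute = 5_000.0  # USD per month (cited compute cost for a heavy user)

# Multiple by which the subscription falls short of compute cost alone
break_even_multiple = underlying_compute / subscription_price
print(f"Implied break-even multiple: {break_even_multiple:.0f}x")  # -> 25x

# Price the same subscription would need just to cover its own compute
print(f"Break-even subscription price: USD {underlying_compute:,.0f}/month")
```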
To sustain growth, these companies must not only cover daily operations but also recoup five years of losses, repay venture capital, and replace rapidly depreciating hardware such as GPUs and RAM. The bill for the last five years of AI progress is coming due. Businesses that build AI into their core workflows today, based on subsidised prices, may find themselves cornered when the market finally demands its 40x correction. So why is this happening?
The Wall of Diminishing Returns
Let's look more closely at the law of diminishing returns in the context of AI. We can think of an AI system as a sports car built from three fundamental parts: the data (the wheels), the model (the engine), and power consumption (the fuel).
The industry is shifting from an exponential growth phase to a plateau of diminishing returns across these three primary dimensions.
In our analogy, data represents the wheels: the point of contact between the machine and reality. Most high-quality human-generated data has already been consumed, and what remains is increasingly "polluted" by AI-generated content. Feeding a model its own output rather than genuine human-produced text leads to a phenomenon called "model collapse" (imagine a punctured wheel), in which the system gradually loses its ability to represent reality and fails under its own weight.
The model is the engine.
While we have been taught that a bigger engine equals more power, we are discovering that doubling the engine size no longer doubles the performance. Frontier models currently sit between 3 and 5 trillion parameters. To achieve a true "human-replacement" level of intelligence where no human oversight is required, we may need models scaled to 50 trillion or even a quadrillion parameters. However, the compute required to serve such a model at peak performance would likely exceed any price a consumer is willing to pay.
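One way to picture this flattening curve is a simple power-law sketch. The exponent below is an assumed illustrative value in the spirit of published scaling-law work, not a fitted number for any particular model, so treat the output as a shape rather than a forecast.

```python
# Illustrative power-law scaling: error falls only as a small power of
# model size, so each doubling of parameters buys a shrinking improvement.
# ALPHA is an assumed exponent for illustration, not a measured value.

ALPHA = 0.05

def relative_error(params_billion: float, baseline_billion: float = 70.0) -> float:
    """Error relative to a 70B-parameter baseline under the assumed power law."""
    return (params_billion / baseline_billion) ** -ALPHA

for size in (70, 140, 280, 3_000, 5_000, 50_000):
    print(f"{size:>6}B parameters -> relative error {relative_error(size):.3f}")
```

Under this assumed curve, doubling from 70B to 140B parameters trims the error by only a few percent: the "bigger engine, barely faster car" dynamic in miniature.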
Power Consumption Crisis
Power is the fuel that keeps the entire infrastructure running. As the "engine" gets bigger, the fuel requirement isn't just increasing, but also becoming a logistical nightmare. As we add more layers to models, the environmental and financial cost of electricity becomes a primary bottleneck.
A single query to a frontier LLM can consume as much energy as keeping an LED lightbulb on for an hour. By 2026, data centre demand is projected to double in many regions, leading to "energy-based pricing" where peak-hour AI inference carries a premium cost. Furthermore, for every watt of power used for computation, nearly another watt is often required for cooling, effectively doubling the operational cost of massive data centres.
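To put the energy claim in monetary terms, here is a minimal sketch. It assumes roughly 10 Wh of compute per frontier query (the LED-bulb analogy above), one extra watt of cooling per watt of compute as described, and a placeholder electricity price; every input is an illustrative assumption rather than a measured figure.

```python
# Illustrative electricity cost per query and at scale.
# All inputs are assumptions chosen to match the article's analogy
# (an LED bulb on for an hour ~ 10 Wh), not measured values.

energy_per_query_wh = 10.0       # Wh of compute energy per frontier query (assumed)
cooling_overhead = 1.0           # extra Wh of cooling per Wh of compute (assumed)
electricity_usd_per_kwh = 0.15   # placeholder grid price, USD per kWh

total_wh = energy_per_query_wh * (1 + cooling_overhead)
cost_per_query = (total_wh / 1000) * electricity_usd_per_kwh
print(f"Electricity cost per query: USD {cost_per_query:.4f}")

queries_per_day = 1_000_000_000  # hypothetical daily query volume
print(f"Daily electricity bill at that volume: USD {cost_per_query * queries_per_day:,.0f}")
```

Even at a fraction of a cent per query, the bill runs to millions of dollars a day at consumer-scale volumes, before any energy-based peak pricing is applied.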
The shift from massive, "jack-of-all-trades" frontier models to more distilled Small Language Models (SLMs) is not just a technical preference, but a financial necessity for survival. As the cost of generalised intelligence becomes prohibitive, the industry is moving toward "surgical" AI that prioritises depth over breadth.
The current economic "sweet spot" appears to reside in models at or below the 70 billion parameter mark. Unlike the 3-to-5 trillion parameter giants that need a dozen or more GPUs just to hold a single serving instance, these smaller architectures can be served with a fraction of the hardware, allowing companies to actually achieve a positive margin.
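A back-of-the-envelope way to see why the ~70B mark is the economic sweet spot is to estimate serving memory from parameter count. The sketch below assumes FP16 weights, 80 GB accelerators, and a flat overhead factor for caches and activations; it ignores quantisation and batching, which would shrink the smaller models' footprint even further.

```python
import math

def gpus_needed(params_billion: float,
                bytes_per_param: int = 2,    # FP16 weights (assumed)
                overhead: float = 1.2,       # rough allowance for KV cache/activations (assumed)
                gpu_memory_gb: float = 80.0  # per-accelerator memory (assumed)
                ) -> int:
    """Estimate how many accelerators are needed just to hold the model weights."""
    weights_gb = params_billion * bytes_per_param  # 1B params * 2 bytes = 2 GB
    return math.ceil(weights_gb * overhead / gpu_memory_gb)

for size in (70, 3_000, 5_000):  # 70B vs. 3T and 5T parameters
    print(f"{size:>5}B parameters -> ~{gpus_needed(size)} GPUs just for weights")
```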
These smaller models achieve high performance by being trained on curated, high-density data for specific domains (such as legal analysis, medical diagnosis, or software engineering) rather than attempting to "consume the entire internet". There is a growing prediction that the market will eventually "throw away" the current generation of frontier models because they are simply too expensive to exist in a real, unsubsidised market.
Furthermore, the most disruptive feature of SLMs is that they break the monopoly of the "Big Cloud" providers. Models under 70B parameters can be run on local workstations or private servers, allowing businesses to bypass the 30x-40x price hikes expected from major AI labs.
SLMs also give businesses better control over prices, and when pricing moves from flat subscriptions to metered usage, demand sorts itself. At USD 20 per seat, idle usage is tolerated. At USD 200 - USD 2,000 per workflow, every query must justify itself.
The arithmetic is simple. At USD 25 - USD 50 per hour for skilled labour, a USD 500 monthly AI workflow must save 10 - 20 hours to break even. Low-value uses (summaries, drafts, generic support) fail that test first. High-value uses (code generation in production, legal review with liability, medical triage) survive because they replace or compress expensive labour.
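The same break-even arithmetic can be written down directly, using the hourly rates and workflow price quoted above.

```python
# Break-even hours for a metered AI workflow, using the figures quoted above.

monthly_workflow_cost = 500.0      # USD per month for the AI workflow
for hourly_rate in (25.0, 50.0):   # USD per hour of skilled labour
    hours_to_break_even = monthly_workflow_cost / hourly_rate
    print(f"At USD {hourly_rate:.0f}/hour, the workflow must save "
          f"{hours_to_break_even:.0f} hours per month to break even")
```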
While the "free lunch" of subsidised AI is ending, this transition marks the beginning of a more mature, value-driven era.
Instead of fearing price hikes, businesses should view this as a "Quality Filter." When AI costs its true value, we will stop using it for trivial tasks and start focusing on high-impact algorithmic improvements. This shift will drive a new wave of innovation centred on efficiency engineering, i.e., creating breakthroughs on the level of the original Transformer architecture that allow us to do more with less.
The winners of the next decade won't be those who integrated the most AI, but those who used the current "subsidy window" to radically upskill their workforce. By treating AI as a high-powered reference tool rather than a crutch, companies can build lean, intelligent workflows that remain profitable even as market prices normalise. The future of AI is not just about intelligence; it is about the intelligence of how we use it.
Disclaimer: The views expressed in this article are those of the author and do not necessarily reflect the views of the publication.