I got Grok to deprecate tokenomics (pun intended)

Regarding the topic of tokenomics (the economics of token usage, that is, how tokens are counted, priced, consumed and optimised in interactions with large language models), there have been clear indications since Dylan Patel’s interview that demand for the best tokens remains exceptionally strong. Indeed, Patel himself mentioned in the interview that a hedge fund was already paying an astronomical sum for preferential access to the highest-quality tokens with each underlying model update. It is therefore only a matter of time before the providers of large language models start charging their customers real money for these top-tier tokens. A very recent video, ‘The $15,000 AI bill. Your $20 subscription is a DELUSION’, makes precisely this point.

My interactions with Grok on this topic began when I tried to retrieve a post on X by a former senior staff software engineer at Google (the creator of Keras during his time there, who left in late 2024, and the author of the classic Deep Learning with Python). I could not initially recall his name, only that he was French.

He had posted a tweet warning that current AI capex may not pay off before the data centres depreciate, a bearish view of the sector:
François Chollet (@fchollet): “The reason it is so important for everyone to keep pretending that AGI is definitely right around the corner is that there is now over $1T of investment riding on this belief (either already expended, or committed). Current (and recent past) capex cannot be justified by current use cases and technology (currently spending $10-15 to make $1). To ever be in the black you'd need dramatically better tech/applications, and you'd need them fast -- before current datacenters depreciate, which is a 3-5 years timescale.”
8:42 PM · Oct 19, 2025 · 331.1K views
https://x.com/fchollet/status/1979981471571222679

More context from Dylan Patel’s full interview (Latent Space podcast, 2026)
Patel doubles down on the bullish hardware economics that directly counters Chollet’s depreciation warning. He calls the “GPUs last <5 years” narrative “complete nonsense,” pointing to real-world extensions: 3–4-year-old Hopper clusters are being re-signed for another 3–4 years, older A100s for multiple more years, pushing total useful life toward 7–8+ years. H100 resale/rental prices are “skyrocketing” because demand for tokens is exploding faster than new supply can arrive.
The colorful language you’re remembering is Patel’s recurring “token rich vs. token poor” framing (he doesn’t say “paupers,” but the vibe is identical). He warns: “if you don’t use more tokens you’ll never escape the permanent underclass.” Companies or teams without sufficient compute get rate-limited, pay premium prices, or simply fall behind while the “token rich” (top labs, Citadel-scale players with exclusive access) extract exponentially more value. He jokes about begging Anthropic’s co-founder for Mythos access and pretending the model doesn’t exist — the laughter in the room underscores how brutally the market sorts winners from “token paupers.”
Patel’s optimistic reading: software efficiency + insatiable token demand keeps old GPUs valuable longer, margins expand, and the capex bet pays off. Chollet’s skeptical reading: without breakthroughs before the 3–5-year depreciation window, the whole $1T edifice looks shaky.
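To make the disagreement over useful life concrete, here is a minimal Python sketch of straight-line depreciation. It assumes zero salvage value and a purely hypothetical capex of 1,000 in arbitrary units; neither Patel nor Chollet presents this calculation, and the function name is mine.

```python
def annual_depreciation(capex: float, useful_life_years: float) -> float:
    """Straight-line depreciation with zero salvage value (an assumption)."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return capex / useful_life_years


# Hypothetical capex in arbitrary units; not a figure from the tweet or the interview.
HYPOTHETICAL_CAPEX = 1_000.0

# 3 and 5 years bracket Chollet's window; 8 years is the upper end of Patel's claim.
charges = {
    years: annual_depreciation(HYPOTHETICAL_CAPEX, years) for years in (3, 5, 8)
}

for years, charge in charges.items():
    print(f"{years}-year life: {charge:.1f} per year")
```

Under these assumptions, stretching the useful life from 5 to 8 years cuts the annual charge from 200 to 125. That is the arithmetic behind Patel's optimism. It says nothing about whether the revenue ever arrives, which is Chollet's objection.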

In the great AI gold rush, Dylan Patel of SemiAnalysis offers a cheerfully brutal taxonomy: the token-rich versus the token-poor. Call the latter “token paupers” — organisations, teams, or individuals stuck on the wrong side of the compute curtain. Patel’s data is merciless and his metaphors sting with laughter. H100 clusters that sceptics once wrote off after five years are now being re-signed for three or four more. Older A100 fleets are getting fresh multi-year extensions. Useful life, he insists, is stretching towards seven or eight years — maybe longer. “Complete nonsense,” he calls the five-year obituary.

The secondary market tells the same story louder: H100 resale and rental prices are skyrocketing. Supply cannot keep up with exploding token demand, so yesterday’s hardware keeps earning its keep. Patel paints a picture of frantic re-signings and margin expansion across the stack. Cloud providers, hardware makers, and even memory suppliers are all riding the same wave. Demand for tokens is so voracious that the entire supply chain feels “sold out.” People with a pulse are fighting over incremental capacity years into the future. TSMC’s capex could hit $100 billion by 2028, and downstream vendors like ASML are already booked solid.

This is where the humour turns dark and the competitive reality bites. Patel warns, almost with a grin: “If you don’t use more tokens you’ll never escape the permanent underclass.” Token paupers — the scrappy SaaS startup, the rate-limited research team, the enterprise stuck on legacy contracts — get priced out fast. Meanwhile the token aristocrats (top labs, Citadel-scale players) lock in priority access, first dibs on the newest models, and outsized returns. Patel recounts begging an Anthropic co-founder for Mythos access and pretending the model doesn’t exist, drawing laughs because everyone in the room recognises the desperation.

His own firm’s numbers make the point vivid. Spend on tokens has ballooned from tens of thousands to seven million dollars a year. One person using frontier models can now replace entire teams. “If this person can do the work of five to ten to fifteen people,” Patel notes, the maths is obvious. Yet the token-poor stay trapped in old workflows, paying premiums or simply falling behind. The market is brutally efficient at sorting winners from also-rans. Good ideas are cheap; execution is now trivial — provided you have the tokens to run it.
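The "obvious" maths can be sketched in a few lines of Python. The fully loaded cost per person below is a hypothetical placeholder, not a figure from the interview, and the model ignores everything except salaries and token spend.

```python
def breakeven_token_budget(people_matched: int, cost_per_person: float) -> float:
    """Largest annual token spend at which one augmented worker costs no more
    than the team whose output they match."""
    if people_matched < 1:
        raise ValueError("people_matched must be at least 1")
    # The augmented worker still draws one salary, so only the other
    # (people_matched - 1) salaries are freed up to pay for tokens.
    return (people_matched - 1) * cost_per_person


# Hypothetical fully loaded annual cost per person, in dollars.
HYPOTHETICAL_COST_PER_PERSON = 200_000.0

# Patel's "five to ten to fifteen people" range.
budgets = {
    n: breakeven_token_budget(n, HYPOTHETICAL_COST_PER_PERSON) for n in (5, 10, 15)
}

for n, budget in budgets.items():
    print(f"{n}x output: up to ${budget:,.0f} of tokens per year")
```

With that placeholder salary, a worker matching ten people breaks even at $1.8 million of tokens a year, which suggests why a seven-figure token bill can still look cheap to the firm paying it.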

In this new hierarchy, being deprived of tokens is no longer a budgeting footnote. It is a structural disadvantage that compounds daily. Models improve faster than infrastructure can scale, concentrating power among those who can pay or negotiate their way to the front of the queue. Token paupers watch from the sidelines as the token-rich pull further ahead, turning compute access into the defining moat of the decade.

Patel’s message is equal parts optimistic and Darwinian: the hardware lasts longer than expected, the margins are expanding, and the demand curve is still pointing straight up. But if you’re not aggressively spending on tokens today, you’re already falling behind — and the gap only widens tomorrow. In the age of AI, token poverty isn’t just inconvenient. It’s career-ending.

Lausanne, the above was published on the fourteenth day of the sixth month of the year two thousand and twenty-six.