🧠 AI's Cost Crisis: How Fintech Could Save Billions

Welcome to Fintech Brainfood, the weekly deep dive into Fintech news, events, and analysis. You can subscribe by hitting the button below, and you can get in touch by hitting reply to the email (or subscribing then replying)

Hey Fintech Nerds 👋,

I’m in Madrid at the Tempo offsite after a Money 2020 Europe that was very stablecoin heavy. By booth, by content, and by dinner conversation. Stablecoins are real, folks. The payments nerds are using them even if The Clearing House and big banks aren’t yet.

Ramp’s enormous $750m Series F at $44bn is a significant market signal. Yes, they’re now one of the 15 largest private companies in the world (and one of three fintech companies on that list, behind Stripe and Revolut). But the signal is their AI narrative. Yes, they have phenomenal growth. But Ramp has given a masterclass in turning AI into a tailwind, and positioning themselves to solve the coming cost crisis in AI. Which, in a week where Uber said it burned through its entire annual AI budget in 4 months, is a big need. I cover this more in the Rant this week as a theme, and as a Thing to Know.

In other news, Mastercard enabled multiple new card-network settlement options with stablecoins. Meaning, if you’re a fintech company or bank, you can now settle 7 days a week, and on bank holidays. This is becoming a market default, and a market expectation. Money is becoming default 24/7, and you should plan accordingly.

Love Fintech Brainfood? You’ll ADORE NerdCon. We’re going deeper on agentic commerce and how fintech companies are harnessing AI, with the operators who read this every week. 11/19-20.

Here's this week's Brainfood in summary

📣 Rant: AI’s cost explosion needs a Fintech solution.

💸 4 Fintech Companies:

Aido Lighthouse - Semrush for Agentic Commerce
Roadrunner - AI-Powered Revenue Ops
Primitive - Agent control plane for large FIs
Exponent Fi - The Franchise Finance Platform

👀 Things to Know:

Content Corner: Regulation E and Zelle

Weekly Rant 📣

🧠 AI’s cost explosion needs a Fintech solution.

As AI creates new cost challenges, new categories of Fintech companies and products are being created. Billing for AI products and fraud screening is already shifting. What comes next is turning the raw infrastructure of AI, compute itself, into a vertical financial stack to manage the coming cost explosion.

Uber burned through its entire annual AI token budget in four months. GitHub Copilot, Anthropic, and OpenAI quietly switched to token-based billing, and there’s a reckoning coming. Enterprise AI clearly has a cost problem that we’re only just starting to see.

Fintech is beginning to react. There’s a swathe of AI billing and AI routing companies helping to manage this cost problem. Ramp just got a $44bn valuation on the vision of being the token spend platform. Stripe acquired Metronome, which specializes in this type of billing. Fal AI and OpenRouter help companies buy tokens from lower-cost providers (like DeepSeek, which is 1/30th the cost). And a generation of “harness” or control plane companies is emerging to help manage token spend, security, and compliance.

❝

Ramp says AI token overspend is much worse than SaaS sprawl — at least there you had a monthly bill to check. With AI tokens, what do you check? What’s the dashboard? So Ramp built what they needed: an ability to measure token usage directly, the same way you can measure and manage SaaS spend.

Tokenmaxxing is Stupid Until it Isn’t.

I covered how companies are starting to do more internally to control their token cost in Tokenmaxxing is Stupid Until it Isn’t. And you should read that if you haven’t already.

But as you diagnose and begin to address the cost problem, I believe enterprises will eventually want to control where their tokens are generated. The hardware, the neoclouds, and the data centers.

I think the next fintech category is being born today.

A couple of weeks ago, I profiled three companies that specialize in managing compute (the billing, pricing, and aggregation of GPU workloads). While writing, I couldn’t shake the feeling that something bigger was forming. I think compute marketplaces, sourcing, and billing are becoming a whole category of finance.

Because not everyone buying compute is an AI lab. As cost pressures bite, companies will want to train, fine-tune, and use custom models, or run models on cost-competitive infrastructure.

Broadly, AI is becoming a utility input to the entire economy. So the infrastructure that finances, prices, and manages that input will become critical. At least as critical as energy finance. Possibly more so.

Because no commodity in history has had demand compound this fast.

Today

Compute is becoming a commodity - but isn’t yet priced like one.
The compute financial stack has three layers: price discovery, sourcing, and billing.
Compute is hard to commoditize, but the demand is real.
And the Wall St guys are circling.

1. Compute is becoming a commodity.

Anthropic’s revenue explosion is the fastest in history, bar none. It’s extraordinary.

$700 billion in planned capex from four companies. Compare this to global telecoms, the big buildout of the dotcom era: even today, the entire industry spends $300 billion. Oil and gas, the commodity that powered the 20th century, is around $1 trillion. We’re watching compute capex approach energy capex in a fraction of the time.

Meta, Microsoft, Alphabet, and Amazon have funded capex from free cash flow, but that model is showing early signs of breaking. All the hyperscalers are turning to capital markets to fund their expansion ambitions.

Meta is partnering with Blue Owl Capital (yes, that Blue Owl) on a $27bn data center project.
Oracle debt is trading like a junk bond with credit default swap spreads flaring.
The FT did a deep dive into the sheer scale of private capital coming into the AI building boom.

But as AI is scaling, the exponential growth in costs cannot be borne by the customers of compute.

Uber’s CTO blew through the company’s entire 2026 AI budget in the first four months, with per-engineer API costs running between $500 and $2,000 a month. NVIDIA’s VP of applied deep learning said that for his team, compute costs now exceed employee salaries. Ramp’s data shows average monthly AI token spend across its customers up 13x since January 2025.

This is unsustainable.

And it’s not just Uber. Microsoft canceled thousands of internal Claude Code licenses last month after costs spiraled past expectations, six months into the pilot. GitHub is moving all Copilot plans to usage-based billing on June 1st, explicitly because agentic workflows consume too much compute for flat-rate pricing to survive.

Recently Polymarket reported on a company that had allegedly spent $500m on AI tokens in a single month after not setting employee token limits. And Goldman Sachs expects AI agent token usage to explode 24x by 2030.

“Goldman’s bullish case is that monthly token use could reach 120 quadrillion by 2030, while inference cost per token keeps falling 60%-70% per year. The fight is now between agent productivity and token waste.”
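To put rough numbers on that fight, here is a back-of-envelope sketch, not a forecast. It combines the 24x usage growth and the 60%-70% annual fall in cost per token quoted above. The four-year window and the function names are assumptions made for illustration and are not part of Goldman's analysis.

```python
def spend_multiple(usage_growth: float, annual_price_decline: float, years: int) -> float:
    """Token spend at the end of the window relative to today (1.0 means flat)."""
    return usage_growth * (1 - annual_price_decline) ** years


def breakeven_price_decline(usage_growth: float, years: int) -> float:
    """Annual fall in cost per token needed to keep total spend flat."""
    return 1 - usage_growth ** (-1 / years)


YEARS = 4          # assumption for illustration: a 2026 -> 2030 window
USAGE_GROWTH = 24  # the 24x usage growth figure quoted above

rate = breakeven_price_decline(USAGE_GROWTH, YEARS)
print(f"break-even price decline: {rate:.1%} per year")
for decline in (0.60, 0.70):
    multiple = spend_multiple(USAGE_GROWTH, decline, YEARS)
    print(f"{decline:.0%} decline per year -> spend multiple {multiple:.2f}x")
```

Under these assumptions, cost per token would need to fall by roughly 55% a year just to keep total spend flat at 24x usage. The result is sensitive to the window length and to how much usage overshoots, which is where token waste comes in.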

The start-ups have already understood this, and you can see it in the latest Brex spend data: the two companies winning new spend are fal.ai and OpenRouter, companies that help you buy AI tokens from cheaper, open-weight models like Qwen and Moonshot.

Every company selling AI on a subscription is running toward a pricing cliff, and the companies buying AI on a subscription haven’t realized the price is about to change. Or, in many cases, the lab you signed with has quietly changed it already, and is building its product to consume far more tokens with longer-running tasks.

We’ve seen this kind of price shift before.

In 2010, AT&T killed unlimited data plans because smartphone usage was overwhelming the infrastructure. In 2026, Anthropic, GitHub, and soon anyone who sells an AI product will be changing pricing because AI usage is overwhelming infrastructure.

With the SpaceX S1, we’re starting to see these massive companies go public. Right now, AI labs can subsidize usage to grab market share. In public markets, the pressure on margin will be substantial, and the era of burn-baby-burn for tokens is reaching its final countdown.

xAI recorded an operating loss of $2.47 billion on $818 million in revenue during the first quarter of 2026
OpenAI estimates an annual loss of roughly $14 billion for 2026 (on $13bn revenue)
Anthropic has a positive contribution margin when you exclude training. And they were the first to cut off all-you-can-eat billing for some users.

The solution here, as Goldman suggests, is likely inference costs getting cheaper and companies getting much better at token efficiency. As I wrote in 🧠 Tokenmaxxing is stupid until it isn’t:

❝

AI adoption is not effective AI adoption. In most cases, it is the opposite.

Observable token usage gives your enterprise the ability to tell which teams are burning tokens productively and which are running agents in idle loops, and who’s just using ChatGPT to produce slop instead of doing the work.

The first sign of AI’s cost explosion showing up in Fintech is that spend management is becoming dedicated token spend management.

Ramp’s entire Series F product blog centered on the idea that tokens are the third major category of spend after labor and vendors. Therefore, they need to be managed by CFOs at that macro level.
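As a minimal sketch of what managing tokens as a spend category could look like at team level, the snippet below rolls a usage log up into tokens per completed task. The log format, team names, and numbers are all hypothetical, and this is not a description of how Ramp or any other product does it.

```python
from collections import defaultdict

# Hypothetical usage log: (team, tokens used, tasks completed)
usage_log = [
    ("payments", 1_200_000, 40),
    ("payments", 800_000, 10),
    ("support", 5_000_000, 5),
    ("growth", 300_000, 0),
]


def tokens_per_task(log):
    """Aggregate tokens and completed tasks per team, then divide."""
    tokens = defaultdict(int)
    tasks = defaultdict(int)
    for team, used, done in log:
        tokens[team] += used
        tasks[team] += done
    # None marks teams that burned tokens without completing any task
    return {team: (tokens[team] / tasks[team] if tasks[team] else None) for team in tokens}


report = tokens_per_task(usage_log)
for team, ratio in report.items():
    label = "no completed tasks" if ratio is None else f"{ratio:,.0f} tokens per task"
    print(f"{team}: {label}")
```

The interesting rows are the outliers: a team with a very high ratio, or with no completed tasks at all, is the idle-loop case worth a closer look.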

But there’s something even more interesting happening a layer further down.

Most committed data center builds have not yet been completed; most data center projects are behind schedule. Only about 3 out of 10 are on track. The rest are delayed or canceled. The arrival of new compute supply is not compressing the cost of producing tokens, because supply is not arriving fast enough.

Instead, at the compute layer, raw token generation in data centers is becoming more competitive. As neoclouds and data centers emerge, they’re creating a market where the ability to quickly produce cheap tokens from cheap energy is a competitive advantage. As this layer becomes more competitive, a new type of fintech company is emerging.

The pricing, billing, and financing infrastructure is still limited.

We need new financial products for this market. The market demands them. Oil got this in the 1980s, electricity in the 1990s, and bandwidth... well, we’ll come back to bandwidth.

The emerging compute financial stack has three layers.

2. The Compute Financial Stack

Layer 1: Price Discovery

What is it? Finding the price of compute infrastructure (e.g., H200 GPUs) across multiple providers.

GPU compute has historically been priced by the hyperscalers in giant one-off deals for the labs, and this type of compute pricing will continue (see Anthropic’s new $2.5bn deal with SpaceX for Colossus data center access for inference)

This stack is for everyone else. The folks not spending billions, tens of billions, or hundreds of billions on compute.

For example: Silicon Data tracks rental pricing across major GPU architectures (H100, A100, H200, B200), normalized for configuration and provider. Backed by DRW and Jump Trading, they launched the first daily GPU rental price index on Bloomberg in 2025. Lenders writing large loans to companies buying compute now have a fair way to assess compute pricing.
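To make "normalized for configuration and provider" concrete, here is a toy sketch of a per-architecture price index. It is not Silicon Data's methodology, which is not described here. The providers, prices, and the choice of a simple median are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical quotes: (provider, architecture, GPUs per node, node price in $/hour)
quotes = [
    ("provider_a", "H100", 8, 24.00),
    ("provider_b", "H100", 8, 20.00),
    ("provider_c", "H100", 1, 3.20),
    ("provider_a", "B200", 8, 48.00),
    ("provider_b", "B200", 8, 52.00),
]


def gpu_hour_index(quotes):
    """Median price per GPU-hour for each architecture across providers."""
    per_arch = defaultdict(list)
    for _provider, arch, gpus, node_price in quotes:
        # Normalize for configuration: convert a node price into a per-GPU price
        per_arch[arch].append(node_price / gpus)
    # Normalize for provider: take the median so one outlier quote cannot move the index
    return {arch: round(median(prices), 2) for arch, prices in per_arch.items()}


index = gpu_hour_index(quotes)
print(index)
```

A real index would also have to handle contract length, interconnect, region, and reliability, which is exactly the heterogeneity problem raised below.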

As of early 2026, Silicon Data was showing zero on-demand availability across 90% of providers and renters subletting clusters to each other. When you can’t even find supply at any price, you don’t have a market. You have a commodity screaming for price transparency.

This scenario is exactly like the 1970s oil crisis. OPEC embargoes caused severe supply shocks, leaving gas stations empty. Because buyers couldn't find oil at any price, the market grew desperate for transparency, directly leading to the creation of the WTI crude oil futures market on the NYMEX to find fair value.

There’s just one issue: an H100 is not a B200, and it is not a GB200. Corn is still corn. Can you really standardize something this heterogeneous?

Layer 2: Aggregation and Sourcing

What is it? Finding GPUs you can use for the AI task in front of you, and helping you run that task.

Market context: Companies are now building their own foundation models, like Revolut, Nubank, Ramp, Cursor, and Browserbase. These models outperform the frontier models if they’re trained on a specific workflow with a custom, non-public data set. Cursor now has more data about coding than is available on the web. But they need to find somewhere to actually fine-tune or build that model.

For example: Prime Intellect aggregates GPU supply from 50+ providers and helps companies train, fine-tune, and run evals on their models. They’ve built out the RL post-training infrastructure that lets you close the loop from deployment back to retraining for clients like Ramp, Browserbase, and even NVIDIA.

The missing piece is a competitive cost marketplace. A place where GPU supply meets demand with price signals, not email threads. Prime Intellect is closer to that than anyone outside the hyperscalers.

Layer 3: Billing and Metering

What is it? Billing and metering for the utilization of the GPUs and compute.

Market context: The compute economy is fragmented in a way SaaS wasn’t. Neoclouds, regional colos, and small GPU providers don’t have the engineering teams to wire up a generic billing API.

For example: Internet Backyard automates the quote-to-cash workflow for data centers and GPU providers: quoting, metering, invoicing, collections, and payment routing. Their first product, gnomos, replaces the spreadsheet-and-manual-handoff chaos that sits between sales, ops, and finance at most compute providers.

The “Stripe for GPU billing” positioning is sharp, but it made me wonder: does Stripe itself just absorb this category, too?

This is a category regardless of who wins it.

The longer-term play for all three layers is data. Whoever aggregates pricing, supply, and billing data across the compute economy controls the information layer. That’s the play that turns plumbing into a platform.

(I also wonder if this entire stack needs to become verticalized. I’d imagine if someone did it, you’d be looking at a DoubleClick-sized moment that creates the behemoth of the AI future.)

But before I get way ahead of myself, all of that assumes we can actually turn compute into a commodity.

3. Can Compute Actually Be Commoditized?

Corn is corn. An H100 isn’t a B200.

Every configuration, every network connection, and the entire power supply chain matter immensely. Not all data centers are fully reliable because of their power source or capacity. So you’d be right to assume it’s hard to write a futures contract for a market that is so far away from a standard like “Brent Crude.”

Except.

Anthropic’s CFO, Krishna Rao, said something on Invest Like The Best that reframed this for me. He described how Anthropic uses three chip platforms (Amazon Trainium, Google TPUs, and NVIDIA GPUs) fungibly. They built a custom orchestration layer that lets them swap workloads across architecturally different chips, and he explicitly framed this flexibility as a strategic advantage.

❝

Across those three chip platforms, we're using compute for all of those internal and external uses. And that flexibility—it actually took us a long time to be able to do that... We've invested very heavily to be able to use that compute incredibly flexibly. And then we look across the different generations of those chip platforms and use each generation for the best workload internally.

Anthropic CFO Krishna Rao

If the company building frontier models treats its own compute as fungible in practice, the market can treat compute as fungible in pricing. The asset doesn’t need to be identical. It needs to be interchangeable enough for a contract to reference it. And remember, mortgages aren’t identical either. We can still wrap them into a container, and in a way, that’s what Silicon Data’s index, Ornn’s OCPI, and eventually CME’s futures are doing for compute.

While brainstorming this, Claude helpfully pointed out we’ve had false dawns with commoditizing tech infrastructure before.

And that false dawn was rather infamously Enron. Yes. That Enron.

Quite aside from being one of the most infamous corporate scandals in history for their flagrant accounting fraud, they did try to commoditize a new asset.

They tried to build a bandwidth trading floor in 1999, making the exact “bandwidth is the new commodity” argument that people make about compute now. And you guessed it, it collapsed. The thesis was actually right (bandwidth did eventually commoditize), but the market structure was premature. The liquidity wasn’t there.

Bandwidth commoditized in price through a massive tech supply shock. Wavelength Division Multiplexing (WDM) multiplied fiber capacity overnight, crashing transit prices by 30% to 50% annually.

Yet bandwidth never became a financial commodity like corn, because it is location-bound and perishable: unused capacity vanishes instantly. But it did evolve into a wholesale utility market, behaving more like air cargo or commercial real estate.

This is similar for AI compute: Compute needs to become a utility on price.

The difference for compute: CME is building the exchange, DRW is backing the index, and actual enterprises are already making fungible compute decisions at scale. That’s a different starting position than Enron’s bandwidth play.

So if compute is commoditizing, even imperfectly, what happens next?

Once you have a commodity - Traders want to trade it.

The CEO of BlackRock, Larry Fink, told the Milken Conference that a new asset class will be buying futures of compute because of the power supply chain it relies on so heavily:

❝

“We're short power, we're short compute, we're short chips... I actually believe a new asset class will be buying futures of compute.”

BlackRock CEO Larry Fink

Five days later, CME Group and Silicon Data announced they’re building exactly that. A compute futures market, pending regulatory review, and launching later this year. DRW’s Don Wilson called compute “the largest commodity in the world.”

The contracts will be cash-settled against Silicon Data’s daily GPU rental price indices. AI companies can hedge training costs. Data center operators can lock in revenue. Lenders can underwrite GPU-backed debt against a reference rate instead of a spreadsheet guess.
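As a sketch of how a cash-settled hedge locks in a training budget, consider a buyer who goes long futures and then pays the spot price for compute. The contract mechanics here are simplified and the prices and volumes are hypothetical. The actual contract specifications are not covered in this piece.

```python
def hedged_compute_cost(gpu_hours: float, futures_price: float, settle_index_price: float) -> float:
    """Net cost of buying compute at the index price while long a cash-settled future."""
    spot_cost = gpu_hours * settle_index_price
    # A long position pays out when the index settles above the agreed futures price
    futures_payoff = gpu_hours * (settle_index_price - futures_price)
    return spot_cost - futures_payoff


GPU_HOURS = 100_000   # hypothetical training run
FUTURES_PRICE = 2.50  # hypothetical locked-in $/GPU-hour

for settle in (2.00, 3.25):
    net = hedged_compute_cost(GPU_HOURS, FUTURES_PRICE, settle)
    print(f"index settles at ${settle:.2f} -> net cost ${net:,.0f}")
```

Whether the index settles higher or lower, the net cost comes out at the locked-in price times the hours. The leftover risk is basis risk: the price a buyer actually pays its provider may not match the index.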

The financial stack for oil took decades to mature. Electricity took years. Compute is building its stack in months.

We have at least $700 billion in annual capex (and growing), trillion-dollar infrastructure commitments, and no financial infrastructure to price, source, bill for, and manage the risk of any of it. When that happens, the price usually gets commoditized, and if the input becomes critical to the economy, we get commodity exchanges and futures markets.

That’s why I think this is the next great financial services infrastructure buildout.

Because today this all runs on spreadsheets, Slack threads, and vibes.

That won’t last for long.

ST.

4 Fintech Companies 💸

1. Aido Lighthouse - Semrush for Agentic Commerce

Aido Lighthouse continuously monitors your website to ensure agents have the best possible experience from your APIs, checkout flows, and data. The service performs over 110 checks (like WebMCP, MCP, and checkout flows) to generate a single “readiness score” and give a prioritized list of fixes.

🧠 Agentic commerce is a gold rush (albeit one yet to produce much gold). Aido is selling shovels. What Semrush did for SEO, Aido aims to do for agentic commerce. What I like is their real-time view browser so you can see where agents got stuck, and what’s driving a specific recommendation. Unlike SEO where agents can simply find you, Aido wants to tell you if they can buy from you. They also built the Agent-Optimized Catalog Feed (AOCF), an open extension to Google Merchant Center, which is well worth a look if you’re into that sort of thing. But what happens if Shopify and BigCommerce ship this capability?

2. Roadrunner - AI-Powered Revenue Ops

Roadrunner helps sellers turn a customer opportunity into a complex quote for enterprise pricing. It answers “how should I structure this deal” from natural language prompts like “add a discount tier if they hit 1m API calls.” They’ll white-glove onboard users from legacy quote tools like Salesforce CPQ, Conga, or Dealhub, and it even generates pricing scenarios.

🧠 If time kills all deals, getting to the quote faster helps keep deals alive. Most start-ups have a handful of RevOps folks and a spreadsheet. Enterprises have big legacy tools they don’t like. Can they displace existing legacy tools, or will those tools update themselves to compete? And is the better approach to go with the growth companies and displace their spreadsheets? Maybe both. Roadrunner also has a really nice AI feature, quote scoring against your own closed-lost history. That’s super hard for an existing deal desk to do.

3. Primitive - Agent control plane for large FIs

Primitive helps financial institutions manage, govern, observe, and secure AI agents from multiple providers in a single hub. It helps track token usage, outcomes, and performance in real time. Institutions can also quickly build agents in its “assembler” and manage secure connectivity to enterprise core systems and tools.

🧠 Every FI needs this. Their choice today is to go all in with one lab (e.g. OpenAI’s Foundry), or build a harness internally (like Ramp or Block have). Given the high burden of security and difficulty of enterprise integrations, DIY is rare. BBVA went all in with OpenAI which could work, or could become costly in time. JP Morgan is building their own, because they’re big and they can, but it also takes time. Everyone else is still trying to figure out what they do next. Bullish on this concept. (Discl: I’m not an investor but the founder is an old friend and mentor)

4. Exponent Fi - The Franchise Finance Platform

Exponent Fi helps franchisees manage their franchise loan, uses AI to help manage accounts, and issues corporate charge cards. The franchise loan dashboard creates a single dealroom for all document intake, tracking, and closing. Then the corporate card helps manage franchise expenses.

🧠 The platform is a play to get the data to perform better underwriting. Where banks see what data you submit to them, Exponent Fi sees everything: your spending, your loans, your accounts. By giving you the tools to make your life easier, they get better data. Simple. Brilliant. Exponent Fi is a licensed SBA lender in all 50 states and the type of company that tech enables.

Things to know 👀

1. Ramp raised $750m at $44bn

Ramp says the new capital raise, led by ICONIC and GIC, will go toward building the cost infrastructure for AI. The big picture the investors are buying is that there's a structural change to the economy coming. Ramp says purchase volume grew ~170% year on year in March, the fastest in three years, at roughly 20x the size they were 3 years ago.

🧠 That’s 8.5x larger than Brex’s exit to Capital One. Brex created this category. Brex is an incredible company by any measure. Capital One bought the whole company in April for $5.15B, a steep markdown from Brex's $12.3B peak.

🧠 Growth rate matters. Ramp is growing faster now than it has for the past 3 years. That shouldn’t happen. Outside of the AI companies, it’s extremely rare.

🧠 Why raise now? Every growth investor is looking for companies that benefit from (rather than are threatened by) AI. Ramp has a credible story, in its growth metrics and in its product focus.

🧠 Ramp has the clearest and most coherent AI narrative of any financial services company. Ramp’s launch blog defined tokens as the third major cost of companies. We had talent and vendors; now we have tokens. Those tokens are exploding in cost, and Ramp will be the way to manage token spend.

🧠 Ramp shipped Stack, a "Harvey for Accountants." An AI operating system that codes transactions, posts journal entries, and runs the close, sitting inside the firm's data instead of bolted on top.

2. Mastercard expands settlement options including stablecoins.

Mastercard announced plans to expand settlement capabilities to include stablecoin, intraday, holiday, and weekend options, giving partners more choice in how and when transactions are settled. Settlement works with USDC, USDG, USDP, PYUSD, RLUSD, and SoFiUSD across Arbitrum, Base, Canton, Ethereum, Polygon, Solana, Tempo, and XRPL. ARQ Finance, CBW Bank, Cross River, Lead, and Nuvei will be among the first to support.

🧠 Mastercard's card settlement used to take weekends and holidays off. That just changed. Stablecoins make 24/7 an option for anyone who wants to settle faster.

🧠 And if you're thinking "but my card works on Sundays," consider that the money behind it moves the next business day, and a long holiday weekend could stretch that to Tuesday. Which SUCKS if you're a merchant and you're waiting to get paid.

🧠 24/7 is becoming a default and an expectation. Companies like MoneyGram, dLocal, Deel, Remitly and countless others have experienced the benefit of 24/7 treasury and payments. They want that now; they want to launch cards. If you don’t support that you don’t get their business.

Good Reads 📚

1. Regulation E and Zelle

Patrick walks through how US electronic payment protections were built, why cards have a whole liability-transfer stack behind them, and why Zelle breaks the model. Cards can push losses down to merchants, processors and networks. Zelle mostly cannot. So banks have an incentive to define fraud as “authorized” whenever the customer touched the phone.

That's all, folks. 👋

Remember, if you're enjoying this content, please do tell all your fintech friends to check it out and hit the subscribe button :)

Want more? I also run the Tokenized podcast and newsletter.

(1) All content and views expressed here are the authors' personal opinions and do not reflect the views of any of their employers or employees.

(2) All companies or assets mentioned by the author in which the author has a personal and/or financial interest are denoted with a *. None of the above constitutes investment advice, and you should seek independent advice before making any investment decisions.

(3) Any companies mentioned are top of mind and used for illustrative purposes only.

(4) A team of researchers has not rigorously fact-checked this. Please don't take it as gospel—strong opinions weakly held

(5) Citations may be missing, and I’ve done my best to cite, but I will always aim to update and correct the live version where possible. If I cited you and got the referencing wrong, please reach out
