Nebius: The World’s Biggest Factory Will Produce Tokens

$NBIS Q1 2026 ER Update

May 23, 2026

The most important line from Nebius’ Q1 2026 update was not about revenue growth alone. It was about the shape of demand.

Marc Boroditsky, CRO at Nebius, put it clearly:

The referenced pipeline growth of 3.5x a quarter, which -- 3.5x quarter-over-quarter, which we’re very proud of, is for our AI cloud business, and it does not include any strategic hyperscaler deals like the Meta deal. It does include qualified opportunities across our core AI cloud and Token Factory products as well as across all of our key customer segments, including AI natives, software vendors and enterprises.
What we can share about conversion is that we have maintained our solid win rates at the same time as we’ve accelerated our sales cycles and increased our average selling prices. And you can see this with some of the strong wins that we have, such as Sword Health in healthcare, life sciences and Rhoda and 1X in physical AI and core automation in one of our AI-native model builders as well as Revolut and monday.com, new customer wins for Token Factory.
What we are doing is enabling our go-to-market teams to have a consultative conversation with our customers about their plans and for current and future workloads, including, as an example, what they’re thinking about with regard to Vera Rubin. We’re also focusing and scaling our go-to-market and success teams to help customers to realize their plans, which turns into durable revenue for us.

This is exactly what I wanted to see.

Nebius is not simply selling GPUs into a hot market. It is starting to convert GPU scarcity into a broader AI infrastructure relationship. That is a very different business.

The company’s ability to deliver both high-performance, large-scale clusters and smaller-scale, on-demand compute is becoming a real advantage. In Q1, that advantage translated into a sequential re-acceleration in revenue. The momentum was broad-based and compounding:

Pipeline generated in the quarter reached a new record, increasing approximately 3.5x quarter over quarter.
Pricing continued to rise for new-generation GPUs.
Older-generation chips also saw strong pricing support.
Average deal sizes increased across both new and existing customers, driven by pricing, GPU commitments, and longer contract durations.
Demand widened across verticals, model builders, enterprises, software vendors, AI-native companies, and inference workloads.

The most important detail is that the 3.5x pipeline growth excludes strategic hyperscaler deals like Meta. In other words, this is not a single large customer distorting the picture. This is the core AI cloud and Token Factory business accelerating.

Contracted capacity already exceeds 3.5 GW, far surpassing the 3 GW goal Nebius had set for the end of the year. Because of that execution, management raised its contracted power guidance to more than 4 GW by year-end.

This matters because, in AI infrastructure, demand is not enough. Everyone has demand. The question is who can convert demand into capacity, capacity into revenue, and revenue into a deeper customer relationship.

Nebius is starting to show that it can.

The World’s Biggest Factory Will Produce Tokens

My core thesis is simple: the biggest factory of the next decade will not produce cars, steel, semiconductors, or oil.

It will produce tokens.

I expect AI to dramatically expand economic output over the next few years. But I also believe that, relative to the size of the economy, there will be fewer and fewer winners, because the companies that successfully build AI-powered flywheels will often capture entire market segments.

Once the data/AI flywheel is built, working, and spinning, it becomes extremely difficult to stop. The company gains more users, which creates more data, which improves the product, which attracts more users, which creates more data, which improves the product again.

The flywheel spins faster and faster.

And then another force kicks in: talent.

To make the data/AI flywheel spin, you need exceptional talent. Talent is naturally limited. And the best talent will increasingly concentrate inside companies where the flywheel is already spinning, because those companies will have more ambition, more resources, better data, better compensation, and a higher probability of winning.

So the companies that reach the “spinning flywheel” stage early will attract more talent. The companies that do not reach that stage soon enough will be left behind.

This is why I believe the ingredients of an AI flywheel are:

Visionary and brave management.
Data.
Talent.

But the first factor is the trigger.

Visionary management starts the process. It chooses the right market, takes the hard decisions, collects the right data, attracts the right people, and pushes the organization through the painful transition before competitors understand what is happening.

Once the flywheel spins for a few years, the company becomes almost impossible to catch.

I also believe that most of these AI flywheels will be built by AI-native companies, not incumbents. Most incumbents carry too much legacy. Their culture, incentives, shareholders, technical debt, and profit pools make internal disruption almost impossible.

The exceptions will be founder-led incumbents where the founder still has enough power, hunger, and vision to take painful decisions against the will of shareholders focused on short-term profit and risk avoidance.

$META is one example. Mark Zuckerberg has been systematically throwing billions at new bets and restructuring the company to take full advantage of the AI revolution.

Alphabet is another example. It has been willing to put its highly profitable Search advertising business at risk in order to push its AI business.

But most winners will be AI native.

And many of those AI-native companies will build on Nebius Token Factory.

Why?

Because Nebius is trying to offer something more valuable than raw GPU access. It is building an environment where AI-native companies can build with ownership, access the tools they need, and generate tokens at the lowest possible cost.

That is the key.

In an AI-native economy, tokens are not just a technical output. Tokens are the cost of doing business. They are the new unit of production.

Routing data from OpenRouter suggests proprietary models still dominate usage, with proprietary non-Chinese models averaging around 70% of token volume in 2025. But the direction of travel is clear: open models are gaining share as performance improves. My view is that this mix can flip over the next two to three years.

For now, open models are not good enough for many use cases. But if they reach something close to GPT-5.5 / Opus 4.7 level, everything changes. I think that could happen sooner than most investors expect, but this remains a forward-looking bet, not a fact.

There will always be specific use cases where organizations prefer frontier closed models. But over time, that percentage should decline. Most workflows will not need the absolute best model in the world. They will need the best cost/performance model for that specific job, integrated into a reliable workflow, running at scale.
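The "best cost/performance model for that specific job" logic can be sketched in a few lines. This is only an illustration: the model names, quality scores, and prices below are hypothetical placeholders, not real benchmarks, and nothing here reflects an actual Nebius or Token Factory API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str                       # hypothetical label, not a real product
    quality: float                  # illustrative task-quality score in [0, 1]
    usd_per_million_tokens: float   # illustrative price, not a real quote


def pick_model(options: list[ModelOption], min_quality: float) -> Optional[ModelOption]:
    """Return the cheapest option whose quality meets the bar for this job."""
    good_enough = [m for m in options if m.quality >= min_quality]
    if not good_enough:
        return None  # no model clears the bar for this job
    return min(good_enough, key=lambda m: m.usd_per_million_tokens)


# Made-up catalog: one closed frontier model and two open models.
CATALOG = [
    ModelOption("frontier-closed", quality=0.95, usd_per_million_tokens=15.0),
    ModelOption("open-large", quality=0.88, usd_per_million_tokens=2.0),
    ModelOption("open-small", quality=0.75, usd_per_million_tokens=0.3),
]

# A routine support chat tolerates a lower bar than a hard reasoning task.
support_chat = pick_model(CATALOG, min_quality=0.70)
hard_reasoning = pick_model(CATALOG, min_quality=0.93)
print(support_chat.name, hard_reasoning.name)
```

Under these assumed numbers, the routine job lands on the cheapest open model and only the hardest job pays for the frontier model. As open models improve, more jobs clear the bar without the premium, which is the shift in mix described above.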

This is where Nebius Token Factory becomes interesting.

My view is that Token Factory has the potential to become the preferred environment where companies build their AI-powered workflows. From there, they can still call external frontier models for specific tasks when needed. But the main workflow, the infrastructure layer, the orchestration layer, and the economics sit inside Nebius.

That would make Token Factory a gatekeeper.

And as human labor is increasingly leveraged or replaced by AI, a larger and larger percentage of every company’s cost structure will be related to token generation.

That is why I believe Nebius Token Factory has the potential to become the world’s biggest factory.

Not because it will produce the most GPUs.

Because it can become one of the places where the new economy produces work.

Significant Token Factory customer wins in Q1 include:

Revolut, which is developing a platform to simplify finance for businesses and consumers, was able to remove human intervention from 80% of support chats and handle 1.2M chat tickets per month on Token Factory.
monday.com selected Token Factory to support its AI work platform that helps manage, orchestrate, and execute workflows.

These customers show why Nebius is not just another GPU reseller. The company is moving up the stack, closer to where the customer’s AI workload is actually built and scaled.

Roman Chernin explained the logic very well:

First of all, I want to say that we are super excited that these 2 incredible teams of talented people from Eigen and Clarifai will join us. And to deep dive into the rationale, let’s start from foundations.
Our view is that we should own the compute stack. That is where our vertical integration, our supply chain depth and our hardware engineering generate advantage and it’s also the layer that drives the bulk of our economics. But the compute stack, we built the full cloud solution and software plays the role of enabler. By the way, we partner where partnership is the right path. And as you see now, we use M&A selectively where it accelerates our road map, brings in proven developer adoption or as capabilities complementary to what we are building.
Acceleration is the key lens we apply to every potential transaction where we can find rare talent or proven adoption. This is, by the way, the example of Tavily that has incredible developer adoption that would, in general, take us meaningfully longer to build organically, acquisition is the fastest path. And we evaluate every potential deal against the clear criteria. Does it deepen customer engagement, increase lifetime value, unlock the new category of customers or use cases we can address and in general, strengthen our position as a full stack AI cloud.

This is exactly the right strategy.

Nebius does not need to become a traditional software company. It does not need software as a separate revenue stream. It needs software as an enabler of compute consumption, customer stickiness, workload expansion, and lifetime value.

Roman made this even clearer:

As I said, we look at the software as an enabler. So it’s not that we build the software to generate a separate revenue stream. The software first of all plays the role of unlocking the new capabilities for us, unlocking the new opportunities in the types of the workloads that are growing on the market and the types of customers that we can address.
Software changes the shape of the customer relationship. Every layer of software unlocks another group of users and customers. We want to meet customers where they need us and let them consume our vertically integrated solution in the way that they need, and it might be a different way for different types of customers.
Customers come to our platform for different needs. In essence, they all need to run AI at scale — which means they need compute. But for example, people who use our multi-tenant cloud, they -- to a big extent come for large training jobs. People who -- and these are like research-driven, data scientist-driven workloads. People who come to Token Factory, they build vertically integrated vertical AI products or apply AI in their enterprises. And they come for the tokens. And moving forward, we’ll see new ways to consume infrastructure at scale that will be the agentic -- end-to-end agentic workloads.

This is the point most investors still miss.

If you only look at Nebius as a GPU cloud, you compare it with every other company that has land, power, and Nvidia chips.

But if you understand Token Factory, the comparison changes.

Nebius is trying to become the vertically integrated infrastructure layer where AI-native companies generate tokens, build workflows, deploy agents, scale inference, and eventually run entire AI operating systems.

That is a much bigger opportunity than renting GPUs.

Why Nebius Is Well Positioned to Win the AI Supply Chain Bottleneck Race

The most common argument I hear from investors and finfluencers is that power supply is the new bottleneck for AI infrastructure. Therefore, companies like IREN, which already own a decent amount of connected power, will win the race against companies like Nebius, which supposedly will not build enough capacity on time to serve demand.

Supporters of this thesis even claim that companies like IREN are delaying deals on purpose, waiting for the moment when the supply/demand imbalance is at its peak, so they can extract better terms because they own the scarcest resource.

I recognize the bottleneck. It is real.

Power matters. Land matters. Copper matters. Grid connection matters. Speed of execution matters.

But I think this thesis is still too shallow.

It can work in the short term. In a very tight market, the owner of the scarce physical resource can capture a lot of value.

But over the medium to long term, let’s say beyond three years, the winner will be the company that delivers the highest value per token.

And to deliver the highest value per token, you need to deliver now.

A few months of delay can mean losing the race. In AI, timing is not a detail. Timing is strategy.

The AI race is not just about land and copper. That is the physical layer. It matters, but it is not the full game.

The AI race is about spinning the AI flywheel faster than anyone else in your domain.

We are already seeing this with models.

OpenAI has more compute available than Anthropic. Yet Anthropic has been gaining a lot of ground, especially in coding and enterprise. Why? Because Anthropic found a way to spin the AI flywheel faster in a specific domain.

It focused on coding and enterprise instead of trying to win every consumer use case at the same time. The iteration pace in that domain is faster. The feedback loops are clearer. The willingness to pay is higher. And once your LLM can improve the workflows of the people building the next LLM, the flywheel becomes extremely powerful.

At that point, the chance for a competitor to catch up becomes much lower, unless the leading company hits a technological limit soon enough for laggards to recover.

The obvious objection is:

What about compute? Anthropic is constrained. OpenAI will catch up.

But Anthropic just got help from Elon.

And that is exactly my point.

If you have the talent, the demand, and the best product, you will find the resources.

Resources are attracted to the winning player.

As a supplier of power, land, compute, capital, or any constrained resource, who would you rather sell to? The company with the strongest product and fastest flywheel, or the company with idle capacity and no clear edge?

This is the same dynamic we see with talent. Talent flows to the winning player. Capital flows to the winning player. Suppliers flow to the winning player. Partners flow to the winning player.

This is why I do not believe the AI infrastructure race will be won simply by whoever has the most connected power today.

Connected power is valuable.

But connected power without a differentiated customer relationship becomes a commodity.

Actually, I think the market is framing the bottleneck in the wrong unit.

The scarce resource is not power in isolation.

The scarce resource is useful tokens per watt.

This is where software becomes more important than power.

SemiAnalysis InferenceX data shows how dramatic the difference can be. According to Nvidia, recent InferenceX results show that its software optimizations and Blackwell Ultra GB300 NVL72 systems can deliver up to 50x higher throughput per megawatt and 35x lower token cost than Hopper for DeepSeek-R1.

Think about what that means.

If one operator owns 1 GW of connected power but converts that power into tokens inefficiently, and another operator owns less power but converts every megawatt into dramatically more useful inference, the second operator may generate far more economic output.

The market sees megawatts.

Customers see latency, reliability, model quality, throughput, cost per token, developer experience, and time to production.

Capital sees revenue per megawatt.

That is why the most important metric in AI infrastructure will not be connected power alone. It will be revenue per megawatt.

Power is the input. Tokens are the output. Software determines how efficiently one becomes the other. That is why “I own power” is not enough as a long-term thesis. Power without software optimization is like owning oil without a refinery: valuable, but not the highest-value part of the chain.
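To make the power-versus-efficiency argument concrete, here is a rough sketch of the arithmetic. Every number is a hypothetical assumption chosen for easy math: the two operators, their megawatts, the throughput figures, and the token price are invented. The 10x efficiency gap is deliberately well below the "up to 50x" figure cited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    name: str
    power_mw: float                 # connected power (hypothetical)
    tokens_per_mw_hour: float       # throughput per megawatt (hypothetical)
    usd_per_million_tokens: float   # realized price (hypothetical)

    def tokens_per_hour(self) -> float:
        """Total token output: power multiplied by conversion efficiency."""
        return self.power_mw * self.tokens_per_mw_hour

    def revenue_per_mw_hour(self) -> float:
        """Revenue earned by one megawatt running for one hour."""
        return self.tokens_per_mw_hour / 1_000_000 * self.usd_per_million_tokens


# Operator A owns 1 GW but converts power into tokens at a baseline rate.
power_rich = Operator("power-rich", power_mw=1000.0,
                      tokens_per_mw_hour=1e9, usd_per_million_tokens=1.0)

# Operator B owns one fifth of the power but is assumed 10x more efficient.
efficient = Operator("efficient", power_mw=200.0,
                     tokens_per_mw_hour=1e10, usd_per_million_tokens=1.0)

for op in (power_rich, efficient):
    print(f"{op.name}: {op.tokens_per_hour():.2e} tokens/hour, "
          f"${op.revenue_per_mw_hour():,.0f} per MW-hour")
```

With these assumed inputs, the operator with one fifth of the power produces twice the tokens and earns ten times the revenue per megawatt. The sketch holds the token price equal for both operators to keep things simple. In a real market, lower cost per token would likely push prices down, so the revenue gap would be smaller than the throughput gap.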

The best AI infrastructure companies will not only secure electricity. They will continuously improve scheduling, orchestration, utilization, inference serving, model optimization, networking, cooling, workload routing, reliability, and customer workflow integration.

That is exactly where Nebius is trying to go with Token Factory.

The real question is: who can turn power into tokens, tokens into workflows, workflows into customer lock-in, and customer lock-in into a compounding data/software/infrastructure flywheel?

That is where Nebius is better positioned than the market understands.

Nebius is not just trying to build capacity. It is trying to build the AI supply chain layer where customers come not only for compute, but for tokens, tools, workflows, and eventually agentic infrastructure.

That is a much more durable position.

Why Nvidia Has Every Reason to Back Nebius

The second key piece of the thesis is Nvidia.

As hyperscalers increasingly diversify chips and invest more in proprietary silicon, they are becoming, de facto, competitors to Nvidia.

Google has TPUs. Amazon has Trainium and Inferentia. Microsoft is working on its own chips. Meta is developing more internal AI silicon. Tesla and xAI may eventually go deeper into custom silicon as well.

This does not mean Nvidia is weak. Nvidia’s moat is still extremely strong.

But it does mean this will not be a winner-takes-all chip market forever.

Nvidia needs to defend its ecosystem, keep customers locked into CUDA, maximize distribution, and avoid becoming overly dependent on hyperscalers that are increasingly incentivized to reduce Nvidia dependence over time.

This is why neoclouds matter.

Recent news about Google and Blackstone creating a new AI cloud company confirms the direction. For Google, it is not enough to use TPUs internally and inside Google Cloud. It is building an external, neocloud-like distribution surface for its chips.

The probability that Amazon eventually does something similar is high. Elon could follow in a few years. AMD could do the same. Chinese players may eventually enter the American and European markets as well.

The implication is clear: chip companies need distribution.

And Nvidia has more incentive than ever to support strong independent distribution partners.

Neoclouds like Nebius, CoreWeave, and IREN will be key to that strategy. Over time, they can become one of Nvidia’s most important go-to-market surfaces, especially for AI-native companies, model builders, enterprises, and software vendors that do not want to depend entirely on hyperscalers.

But not all neoclouds are equal.

The winner will not simply be the one that buys the most GPUs. The winner will be the one that creates the most customer stickiness.

Stickiness is not built by selling bare metal.

That is why I think IREN’s long-term upside depends on whether it can move beyond scarce power and build real software/workload stickiness. If it remains mostly a capacity seller, I think it will be structurally disadvantaged versus Nebius and CoreWeave.

IREN now seems to be trying to pivot toward a Nebius-like model, more focused on software edge and more targeted to end users instead of only hyperscalers. But in my view, the burden of proof is now on IREN to show it can build that ecosystem before Nebius and CoreWeave compound their lead.

The real race is probably between CoreWeave and Nebius.

CoreWeave has strong execution, scale, and market credibility. But Nebius is more clearly focused on building a lasting moat around the Token Factory ecosystem.

That is what makes Nebius different.

It is not only trying to be a GPU landlord. It is trying to become the place where AI-native companies build, deploy, and scale their token production.

This is, in my view, why Jensen has been so vocal about Nebius lately.

He understands that Nebius can become one of Nvidia’s most important distribution channels. Not because Nebius is the biggest today, but because Nebius is building the kind of customer relationship Nvidia needs in the next phase of the AI infrastructure race.

Nvidia also understands the quantitative side better than anyone.

If SemiAnalysis data shows that software optimizations and new Nvidia systems can create up to 50x more throughput per megawatt and 35x lower token cost than the previous generation for a relevant inference workload, then Nvidia’s objective is obvious: put those systems in the hands of operators that can actually turn them into scaled token factories.

Not every owner of power can do that.

The best partner for Nvidia is not the company with the loudest announcement about megawatts. It is the company that can convert Nvidia’s full-stack platform into customer workloads, high utilization, developer adoption, and recurring token demand.

This is why Nebius matters strategically: it gives Nvidia a partner that can turn full-stack hardware and software into recurring token demand, not just one-off GPU capacity.

That is why Q1 mattered.

Not because one quarter changes the thesis.

But because the quarter showed that the thesis is starting to play out:

Demand is accelerating.
Pipeline is exploding.
Contracted power is already ahead of plan.
Token Factory is landing real customers.
Software is being used as an enabler of compute consumption and customer lock-in.
Management understands the strategic importance of owning the compute stack while moving high enough in the stack to shape customer behavior.
Nvidia has every reason to support the strongest independent AI cloud distribution partners.

The market still wants to reduce Nebius to a simple AI infrastructure story.

I think that is wrong.

Nebius is not just building data centers.

Nebius is trying to build the token factory layer of the AI economy.

And if the economy really becomes increasingly AI-native, the biggest factory in the world will not be measured in square meters, machines, or human workers.

It will be measured in tokens.

Lorenzo2cents (L2C) Takeaways and Performance

My core takeaway from this quarter is simple:

Demand is outpacing supply.
Nebius is scaling capacity fast.
AI-native customers are becoming enterprises.
Nebius is building a software moat.
And open-source models are an underrated tailwind because they accelerate token consumption and commoditize the model layer, pushing value into delivery.

If Nebius becomes the best “token factory” for robust performance at the cheapest price, then it is not “a neocloud”.

It is an emerging hyperscaler.

And that is exactly why I’m here.

As always, here’s the Deep Dive To Date (DDTD): stock performance since my first deep dive and my purchase on July 28, 2025, at $51.63.

4x DDTD

If you want access to:

My portfolio
All my trades in real time
My price targets and how I calculate them
Business Ontology content

Join the Business Ontology Community

My price targets will be shared in the Telegram Group and explained on my YouTube channel.

Please note that:

I can be and will be (hopefully not often) WRONG. This is just my personal strategy—NOT FINANCIAL ADVICE. I don’t know your financial or life situation well enough to give any recommendations. Please do your own due diligence and research. Don’t be LAZY.

Be the architect of your own destiny.

Ciao

Lorenzo
