A landmark agreement between Nvidia and Amazon Web Services is offering one of the clearest signals yet of where the artificial intelligence economy is heading—and how the balance of power between chipmakers and cloud providers is evolving.
Nvidia will supply AWS with 1 million GPUs between now and 2027, according to Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing. The timeline aligns with chief executive Jensen Huang’s projection of a $1 trillion revenue opportunity tied to Nvidia’s next-generation Blackwell and Rubin chip architectures.
While the headline number is striking, the structure of the deal is more revealing. This is not a simple hardware purchase. It is a full-stack infrastructure partnership that spans compute, networking, and, increasingly, inference—the stage of AI deployment where models generate responses and perform tasks in real time.
That distinction marks a turning point.
For much of the past two years, the AI boom has been defined by training—the process of building large language models using vast amounts of data and compute power. Nvidia’s dominance was built on supplying the GPUs required for that phase.
Now, the center of gravity is shifting toward inference. As AI systems move from development to widespread use, the demand profile changes. Instead of massive, one-off training runs, companies need sustained, efficient compute to serve millions—or billions—of user queries.
Buck captured the complexity of that shift bluntly: inference, he said, is “wickedly hard.”
To address it, AWS is not relying on a single class of chip. The deal includes a mix of technologies—Nvidia GPUs, Spectrum networking chips, and newer inference-focused processors such as Groq’s—alongside six additional Nvidia chip types. The goal is to optimize performance across different workloads, from large-scale model training to latency-sensitive applications like chatbots, recommendation engines, and autonomous systems.
This multi-chip approach reflects a broader industry reality. No single architecture can efficiently handle the full spectrum of AI tasks. Instead, hyperscalers are assembling heterogeneous compute stacks, combining different processors to balance cost, speed, and energy efficiency. That has implications for Nvidia’s long-term strategy. The company is no longer just a GPU vendor; it is positioning itself as a systems provider, integrating compute, networking, and software into a unified AI platform.
The inclusion of Nvidia’s ConnectX and Spectrum-X networking gear in AWS data centers is particularly significant. Traditionally, AWS has relied heavily on its own custom-built networking infrastructure, a core part of its competitive advantage. Opening that stack to Nvidia hardware suggests a deeper level of collaboration—and a recognition that AI workloads may require different architectural choices than traditional cloud computing.
It also signals a subtle shift in leverage. Hyperscalers like AWS have spent years developing in-house chips to reduce dependence on suppliers. Yet the scale and urgency of AI demand are forcing a more pragmatic approach: partnering with Nvidia even as they continue to build their own alternatives.
For AWS, the deal is about capacity and speed. Securing access to 1 million GPUs ensures it can meet surging customer demand for AI services, from startups building generative AI applications to enterprises embedding AI into core operations. For Nvidia, the agreement locks in long-term demand and reinforces its central role in the AI ecosystem. It also provides visibility into future revenue streams at a time when investors are closely watching whether the current AI spending boom can be sustained.
There is, however, a deeper competitive undercurrent.
The emphasis on inference—a segment where newer challengers such as Groq are gaining traction—suggests Nvidia is moving to defend its position against a growing field of specialized competitors. Startups and established players alike are targeting inference as a more cost-sensitive and potentially higher-volume segment than training.
If training established Nvidia’s dominance, inference will test its adaptability.
The economics are different. Training workloads are episodic and capital-intensive, favoring high-performance, high-margin chips. Inference workloads are continuous and cost-driven, requiring efficiency at scale. That shift could compress margins over time, even as total demand expands.
At the same time, the deal underscores the sheer scale of the AI buildout underway. A commitment of 1 million GPUs from a single cloud provider points to an infrastructure race that is still in its early stages. Data centers are being reconfigured, power consumption is rising, and supply chains are being stretched to meet demand.
This raises broader questions about sustainability—both in terms of energy usage and capital allocation. Hyperscalers are investing tens of billions of dollars in AI infrastructure, betting that demand will justify the outlay. Nvidia, in turn, is scaling production to meet that demand, tying its growth trajectory closely to the spending cycles of a handful of large customers.
The partnership with AWS illustrates how concentrated that ecosystem has become. A small number of companies—cloud providers, chipmakers, and large AI developers—are effectively shaping the architecture of the AI economy.
But AWS continues to develop its own chips, such as Trainium and Inferentia, aimed at reducing reliance on external suppliers. The coexistence of those efforts with large-scale Nvidia purchases reflects a dual strategy: build internally where possible, but buy externally where necessary to maintain competitiveness.
In that sense, the deal is both collaborative and competitive.
It locks Nvidia into the core of AWS’s AI infrastructure while reinforcing AWS’s role as a gatekeeper of AI services for enterprises. Each depends on the other, even as both seek to expand their own capabilities.
What emerges is a clearer picture of the next phase of the AI cycle. The initial scramble to build models is giving way to a longer, more complex process of deploying them at scale.