NVIDIA Blackwell: 40x Performance Leap Redefines AI Computing Architecture
Introduction
The annual NVIDIA GTC conference has once again proven to be the epicenter of AI innovation, with CEO Jensen Huang delivering a comprehensive keynote that maps out the company's vision for the next generation of computing. This year's presentation, often referred to as the "Super Bowl of AI," showcased NVIDIA's latest breakthroughs in hardware, software, and AI capabilities that will power everything from data centers to autonomous vehicles and humanoid robots.
In his trademark leather jacket, standing on stage without a teleprompter or script, Huang walked the audience through NVIDIA's journey from the origins of CUDA and GPU computing to today's AI revolution, explaining how the company is addressing the exponentially increasing computational demands of modern AI systems. Unlike previous years, this GTC revealed a fundamental shift in how AI computation is understood and implemented, with inference workloads emerging as the most challenging and critical aspect of AI deployment.
Key Points
- Blackwell architecture delivers 40x performance over Hopper for AI reasoning workloads
- Agentic AI requires 100x more computation than previously anticipated
- NVIDIA introduced Dynamo, a new operating system for AI factories
- A comprehensive roadmap through 2027 includes Blackwell Ultra, Vera Rubin, and Rubin Ultra
- New Enterprise AI solutions include DGX Spark and DGX Station
- Silicon photonics breakthroughs will enable massive scale-out while reducing power consumption
- NVIDIA is expanding into robotics with GR00T N1 and the Newton physics engine
The Evolution of AI: From Perception to Agentic Intelligence
Huang began by tracing AI's evolution over the past decade, from perception AI (computer vision, speech recognition) to generative AI (text-to-image, image-to-text), and now to agentic AI. This latest phase represents a fundamental advance where AI systems have agency—they can perceive, understand context, reason, plan, and take action.
"Agentic AI basically means that you have an AI that has agency," Huang explained. "It can perceive and understand the context of the circumstance, it can reason very importantly about how to answer or how to solve a problem, and it can plan and take action."
At the foundation of agentic AI is reasoning capability, which enables AI to break down problems step by step, consider multiple approaches, and verify its own answers. This reasoning process requires generating many more tokens than previous AI approaches.
Huang illustrated this with a demonstration comparing a traditional large language model to a reasoning model solving a wedding seating arrangement problem. While the traditional model generated 439 tokens and produced an incorrect answer, the reasoning model generated nearly 9,000 tokens to correctly solve the problem.
"The amount of tokens that's generated as a result is substantially higher," Huang noted. "Easily 100 times more."
AI Factories: The New Computing Paradigm
One of the most significant conceptual shifts Huang introduced was the notion of "AI factories"—data centers whose primary purpose is generating tokens, the building blocks of AI responses.
"The world is going through a transition in not just the amount of data centers that will be built but also how it's built," Huang explained. "Everything in the data center will be accelerated."
Huang described how computing has evolved from a retrieval-based model to a generative model: "Whereas in the past we wrote the software and we ran it on computers, in the future the computer's going to generate the tokens for the software. The computer has become a generator of tokens, not a retrieval of files—from retrieval-based computing to generative-based computing."
This shift has profound implications for data center architecture and economics. AI factories require extreme computing capabilities, with performance directly affecting quality of service, revenue, and profitability. The computational demands are staggering—each time an AI generates a token, it must process trillions of parameters.
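To make that claim concrete, here is a rough throughput sketch. The one-exaflop rack figure comes from the keynote; the one-trillion-parameter model size, the utilization factor, and the 2-FLOPs-per-parameter rule are illustrative assumptions:

```python
# Rough token-throughput estimate for a dense model on a one-exaflop rack.
# Real serving throughput depends heavily on batching, memory bandwidth,
# and numerical precision; this is only an order-of-magnitude sketch.

rack_flops = 1e18        # one exaflop per rack (from the keynote)
params = 1e12            # assumed one-trillion-parameter model
utilization = 0.3        # assumed fraction of peak actually achieved

flops_per_token = 2 * params
tokens_per_second = rack_flops * utilization / flops_per_token

print(f"{tokens_per_second:,.0f} tokens/second")  # → 150,000 tokens/second
```

Even under these generous assumptions, a single rack's output is finite, which is why token throughput maps so directly onto an AI factory's revenue.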
Blackwell: The Next Generation GPU Architecture
To meet these unprecedented demands, NVIDIA unveiled Blackwell, its next-generation GPU architecture. Blackwell represents a fundamental transition in computer architecture: moving from integrated NVLink to disaggregated NVLink, from air cooling to liquid cooling, and dramatically increasing computational density.
"This is a big deal because we made a fundamental transition in computer architecture," Huang emphasized, showcasing the physical Blackwell system on stage.
The Blackwell architecture features:
- NVLink 72, a high-performance interconnect enabling 72 GPUs to function as one
- Fully liquid-cooled design compressing compute nodes into a single rack
- One exaflop of computing power per rack
- 570 terabytes per second memory bandwidth
- 600,000 components per rack
Compared to Hopper, NVIDIA's previous generation architecture, Blackwell delivers 25x better performance within the same power envelope for inference workloads. When optimized with NVIDIA's new Dynamo software, Blackwell can achieve up to 40x performance improvement for reasoning models.
"This is Ultimate Moore's Law," Huang declared. "This is what Moore's Law was always about. 25x in one generation at ISO power."
Dynamo: The Operating System for AI Factories
To orchestrate the complex workloads running on Blackwell systems, NVIDIA introduced Dynamo, which Huang described as "the operating system of an AI factory."
"Whereas in the past, the way that we ran data centers, our operating system would be something like VMware... in the future the application is not enterprise IT, it's agents, and the operating system is not something like VMware, it's something like Dynamo," Huang explained.
Dynamo manages the intricate distribution of AI workloads across GPUs, handling tensor parallelism, pipeline parallelism, expert parallelism, in-flight batching, and disaggregated inferencing. It optimizes resources for different phases of AI computation—the "prefill" phase (thinking, reading, context processing) and the "decode" phase (token generation).
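Dynamo itself is far more sophisticated, but the core idea of disaggregated inference, routing the compute-heavy prefill phase and the bandwidth-heavy decode phase to separate GPU pools, can be sketched in a few lines. The class names and pool sizes below are hypothetical and do not reflect Dynamo's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt_tokens: int   # processed in the prefill phase
    output_tokens: int   # generated in the decode phase

@dataclass
class DisaggregatedScheduler:
    """Toy model of prefill/decode disaggregation (not Dynamo's real API)."""
    prefill_gpus: int
    decode_gpus: int
    prefill_queue: list = field(default_factory=list)
    decode_queue: list = field(default_factory=list)

    def submit(self, req: Request) -> None:
        # Every request starts in prefill: read the prompt, build the KV cache.
        self.prefill_queue.append(req)

    def step(self) -> None:
        # Finished prefills hand their KV cache to the decode pool,
        # which then generates output tokens one at a time.
        while self.prefill_queue:
            self.decode_queue.append(self.prefill_queue.pop(0))

sched = DisaggregatedScheduler(prefill_gpus=16, decode_gpus=56)
sched.submit(Request(prompt_tokens=4096, output_tokens=512))
sched.step()
print(len(sched.decode_queue))  # → 1
```

The value of the split is that each pool can be sized and batched for its phase: prefill is dominated by raw FLOPS, decode by memory bandwidth per generated token.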
"That piece of software is insanely complicated," Huang acknowledged. The name "Dynamo" was chosen deliberately, referencing the device that started the last industrial revolution: "Water comes in, electricity comes out... Dynamo is where it all started."
NVIDIA's Roadmap: Scaling Up Before Scaling Out
Huang presented NVIDIA's multi-year roadmap for AI computing, emphasizing the company's philosophy of "scale up before you scale out." The roadmap includes:
- Blackwell Ultra (second half of 2025): 1.5x more FLOPS, 1.5x more memory, 2x more networking bandwidth
- Vera Rubin (second half of 2026): Named after the astronomer whose observations provided key evidence for dark matter, featuring a new CPU with twice the performance of Grace, a new GPU, new networking, and NVLink 144
- Rubin Ultra (second half of 2027): NVLink 576, 15 exaflops per rack, 4,600 terabytes per second of bandwidth
Huang clarified a terminology change going forward: "Blackwell is really two GPUs in one Blackwell chip. We call that one chip a GPU, and that was wrong... Going forward, when I say NVLink 144, it just means that it's connected to 144 GPUs, and each one of those GPUs is a GPU die."
To enable this massive scale-up, NVIDIA announced breakthroughs in silicon photonics. The company unveiled the world's first co-packaged optics silicon photonic system based on micro-ring resonator modulator (MRM) technology, which dramatically reduces the power and cost of optical interconnects.
"In a data center, we could save tens of megawatts," Huang explained, noting that 60 megawatts of saved power translates to 100 Rubin Ultra racks that can be deployed for computation instead of networking overhead.
Enterprise AI: Bringing AI to Every Business
Beyond cloud data centers, NVIDIA is bringing AI capabilities to enterprises of all sizes with new hardware and software solutions:
- DGX Spark: A compact AI system with 20 CPU cores, 128GB of memory, and one petaflop of AI compute, priced around $3,000
- DGX Station: A liquid-cooled personal workstation featuring Grace Blackwell with 20 petaflops of performance
- NVIDIA Llama Nemotron: An open family of enterprise-ready reasoning models, delivered as NIM microservices
"100% of software engineers in the future—there are 30 million of them around the world—100% of them are going to be AI-assisted," Huang predicted. "100% of NVIDIA software engineers will be AI-assisted by the end of this year."
NVIDIA also announced partnerships with major enterprise technology providers including Cisco, T-Mobile, Dell, HP, Lenovo, and storage companies like NetApp and Pure Storage to build AI-ready infrastructure for businesses.
Physical AI and Robotics: The Next Frontier
The final segment of Huang's keynote focused on robotics, which he described as potentially "the largest industry of all." With the world facing a projected shortage of 50 million workers by the end of the decade, robots will play a crucial role in addressing labor challenges.
"We'd be more than delighted to pay them each $50,000 to come to work. We're probably going to have to pay robots $50,000 a year to come to work," Huang remarked.
NVIDIA announced several breakthroughs in robotics:
- GR00T N1: An open-source foundation model for humanoid robots with a dual-system architecture for "thinking fast and slow"
- Newton: A physics engine developed in partnership with Google DeepMind and Disney Research, designed for fine-grain rigid and soft bodies, tactile feedback, and fine motor skills
- Cosmos: A technology that generates infinite virtual environments for robot training
NVIDIA's robotics strategy follows the same three fundamental challenges as its AI approach: solving the data problem, developing the right model architecture, and establishing scaling laws.
Conclusion: NVIDIA's Vision for the Future
Huang concluded his keynote by summarizing NVIDIA's achievements and roadmap, emphasizing the company's commitment to advancing AI, enterprise computing, and robotics. The presentation showcased how NVIDIA has positioned itself at the center of multiple technological revolutions—from cloud AI to enterprise computing to physical robots.
The key takeaways from Huang's presentation include:
- Blackwell is in full production with incredible customer demand, driven by the inflection point in AI computation requirements.
- Blackwell NVLink 72 with Dynamo delivers 40x the performance of Hopper for AI factory workloads.
- NVIDIA has established an annual rhythm of roadmaps to help customers plan their AI infrastructure investments.
- The company is building three distinct AI infrastructures: for the cloud, for enterprises, and for robots.
As AI continues to evolve and transform industries, NVIDIA's comprehensive approach—spanning hardware, software, and application-specific solutions—positions the company to lead the next wave of computing innovation. The technologies unveiled at GTC 2025 will likely shape how AI is deployed and utilized across virtually every sector of the global economy in the years to come.
With its bold vision and relentless innovation, NVIDIA is not just riding the AI wave—it's creating the technology that makes the wave possible in the first place.