NVIDIA's Roadmap Revealed: Blackwell, Dynamo, and the Future of AI Infrastructure
Introduction
At the recent GTC (GPU Technology Conference) keynote, NVIDIA founder and CEO Jensen Huang took the stage to share his vision for the future of artificial intelligence and computing. Speaking without notes or teleprompter at what has been dubbed the "Super Bowl of AI," Huang delivered a comprehensive overview of NVIDIA's technological advances and strategic roadmap. The presentation reflected the company's evolution from its GeForce gaming roots to becoming the driving force behind AI infrastructure worldwide.
This year's GTC was particularly significant as it highlighted the fundamental shift occurring in computing—from retrieval-based to generative computing—and showcased how NVIDIA is positioning itself as the architect of AI factories that will power the next generation of intelligent systems across every industry.
Key Points
- NVIDIA's Blackwell architecture offers 40x performance improvement over Hopper for reasoning AI workloads
- The company introduced Dynamo, a new operating system for AI factories that optimizes workload management
- New partnerships were announced with GM for autonomous vehicles and with Cisco and T-Mobile for edge AI
- NVIDIA revealed a multi-year roadmap including Blackwell Ultra, Vera Rubin, and Rubin Ultra architectures
- New enterprise AI products include the portable DGX Spark and liquid-cooled DGX Station workstation
- Breakthrough silicon photonics technology will enable scaling to millions of GPUs with reduced power consumption
- NVIDIA is open-sourcing its robotics foundation model GR00T N1 and partnering with DeepMind and Disney on the Newton physics engine
The AI Revolution: From Perception to Reasoning
Huang began by reflecting on AI's evolution over the past decade. "It has only been 10 years," he noted, tracing AI's journey from perception (computer vision, speech recognition) to generative AI, which has fundamentally changed computing from a retrieval model to a generative model.
"Generative AI fundamentally changed how computing is done," Huang explained. "From a retrieval computing model, we now have a generative computing model... rather than retrieving data, it now generates answers."
The most recent breakthrough, according to Huang, is agentic AI—systems that can perceive context, reason about problems, and plan actions. At the foundation of agentic AI is reasoning, which enables AI to break down problems step by step, approach problems in different ways, and validate its own answers.
"Two years ago when we started working with ChatGPT, a miracle as it was, many complicated questions and many simple questions it simply can't get right," Huang said. "Now we have AIs that can reason step by step by step using a technology called Chain of Thought."
The Computational Challenge: 100x More Than Expected
This advancement in AI reasoning has created an unexpected computational challenge. Huang explained that the industry dramatically underestimated the computational requirements for modern AI:
"This last year, this is where almost the entire world got it wrong. The computation requirement, the scaling law of AI, is more resilient and in fact hyper-accelerated. The amount of computation we need at this point as a result of agentic AI, as a result of reasoning, is easily a hundred times more than we thought we needed this time last year."
The reason is simple but profound: reasoning AI generates substantially more tokens (the building blocks of AI text generation) as it works through problems step by step. Instead of generating one answer in a single shot, reasoning AI might generate thousands of tokens as it thinks through a problem.
Huang demonstrated this with an example comparing a traditional language model to a reasoning model solving a wedding seating arrangement problem. The traditional model produced 439 tokens quickly but got the answer wrong. The reasoning model generated nearly 9,000 tokens and arrived at the correct solution.
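The scale of that gap is easy to check with a quick back-of-envelope calculation. (The 5x per-token cost multiplier below is an illustrative assumption, not a figure from the keynote; reasoning models are typically larger and slower per token, but the exact factor varies.)

```python
# Token counts from Huang's wedding-seating demo.
traditional_tokens = 439   # fast, but wrong answer
reasoning_tokens = 9_000   # approximate figure for the reasoning model

token_ratio = reasoning_tokens / traditional_tokens
print(f"Token ratio: {token_ratio:.1f}x")  # ~20x more tokens per answer

# If the reasoning model also costs more per token -- say 5x, an
# illustrative assumption -- total compute per query multiplies:
per_token_cost_ratio = 5
compute_ratio = token_ratio * per_token_cost_ratio
print(f"Compute ratio: {compute_ratio:.1f}x")  # ~100x, in line with Huang's estimate
```

Roughly 20x more tokens, compounded by a higher per-token cost, lands in the neighborhood of the "hundred times more" computation Huang described.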
NVIDIA's Answer: AI Factories and Blackwell Architecture
To address this computational challenge, NVIDIA has reimagined data centers as AI factories—facilities dedicated to generating tokens that are reconstituted into various forms of information. At the heart of these factories is NVIDIA's new Blackwell architecture.
"Blackwell is in full production," Huang announced, showcasing the GB200 Grace Blackwell Superchip. The system represents a fundamental transition in computer architecture, moving from integrated NVLink to disaggregated NVLink, from air-cooled to liquid-cooled systems, and dramatically increasing component density.
"We have a one exaflops computer in one rack," Huang said proudly, describing the liquid-cooled system that contains approximately 600,000 parts—"like 20 cars worth of parts"—integrated into a single supercomputer.
The performance improvements are staggering. Compared to Hopper, Blackwell delivers 25x performance within the same power envelope for general workloads, and up to 40x improvement specifically for reasoning AI workloads.
Dynamo: The Operating System for AI Factories
To manage these complex AI systems, NVIDIA announced Dynamo, which Huang described as "the operating system of an AI factory."
"Whereas in the past, the way that we ran data centers, our operating system would be something like VMware... in the future the application is not enterprise IT, it's agents, and the operating system is not something like VMware, it's something like Dynamo."
Dynamo manages the complex distribution of AI workloads across GPU clusters, handling pipeline parallelism, tensor parallelism, expert parallelism, in-flight batching, disaggregated inferencing, and workload management. This open-source system optimizes both throughput and response time, helping AI factories find the optimal balance between generating tokens for many users and maintaining fast response times.
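Dynamo's internals aren't spelled out in the keynote, but one of the strategies it coordinates, pipeline parallelism, can be sketched in a few lines. This is a conceptual illustration of how a model's layers get partitioned into contiguous stages, one per GPU; it is not Dynamo's actual API.

```python
def partition_layers(num_layers: int, num_gpus: int) -> list[range]:
    """Split layer indices into near-equal contiguous stages, one per GPU."""
    base, extra = divmod(num_layers, num_gpus)
    stages, start = [], 0
    for gpu in range(num_gpus):
        # Early GPUs absorb the remainder so stage sizes differ by at most 1.
        size = base + (1 if gpu < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages

# e.g. a 96-layer model across 8 GPUs -> 12 layers per stage
for gpu, stage in enumerate(partition_layers(96, 8)):
    print(f"GPU {gpu}: layers {stage.start}-{stage.stop - 1}")
```

In a real deployment this decision interacts with the other strategies Huang listed (tensor and expert parallelism, in-flight batching, disaggregated prefill/decode), which is precisely the combinatorial scheduling problem Dynamo is built to manage.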
"The name Dynamo was chosen deliberately," Huang explained, drawing a parallel to the device that started the last industrial revolution. "The Dynamo was the first instrument that started the last industrial revolution... water comes in, electricity comes out."
NVIDIA's Roadmap: Scaling Up Before Scaling Out
Huang laid out NVIDIA's multi-year roadmap for AI infrastructure, emphasizing the company's philosophy of "scale up before scale out." The roadmap includes:
- Blackwell Ultra (second half of 2025): 1.5x more FLOPS, 1.5x more memory, and 2x more networking bandwidth than the current Blackwell.
- Vera Rubin (second half of 2026): Named after the astronomer whose observations provided key evidence for dark matter, featuring a new CPU with twice the performance of Grace, a new GPU, new networking, NVLink 6, and HBM4 memory.
- Rubin Ultra (second half of 2027): NVLink 576 with extreme scale-up capabilities, 15 exaflops per rack (compared to Blackwell's 1 exaflop), and 4,600 terabytes per second of scale-up bandwidth.
"This is another way of saying before you scale out, you have to scale up," Huang emphasized, showing how each generation offers exponential performance improvements.
Breaking Barriers with Silicon Photonics
To enable scaling to hundreds of thousands or even millions of GPUs, NVIDIA announced a breakthrough in silicon photonics technology. The company unveiled the world's first 1.6 terabit per second co-packaged optics (CPO) system based on micro-ring resonator modulator (MRM) technology.
Huang explained the problem with current optical transceivers: "Each GPU would have six transceivers, these six plugs would add 180 watts per GPU and $6,000 per GPU." At the scale of a million GPUs, that would mean "180 megawatts of transceivers that didn't do any math, they just move signals around."
NVIDIA's solution eliminates the need for traditional transceivers by integrating photonics directly with electronics. The result is a dramatic reduction in power consumption and cost, enabling switches with a radix of 512 ports—something "simply not possible any other way."
"In a data center, we could save tens of megawatts," Huang noted. "Six megawatts is 10 Rubin Ultra racks... 60 megawatts is 100 Rubin Ultra racks of power that we can now deploy into Rubin."
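Huang's transceiver figures are straightforward to verify (the per-plug breakdown of 30 W and $1,000 is simply his per-GPU totals divided by six):

```python
# The arithmetic behind Huang's transceiver numbers.
transceivers_per_gpu = 6
watts_per_gpu = 180          # 6 plugs x 30 W each
dollars_per_gpu = 6_000      # 6 plugs x $1,000 each
num_gpus = 1_000_000         # the million-GPU scale Huang cited

total_megawatts = watts_per_gpu * num_gpus / 1e6
total_cost_billion = dollars_per_gpu * num_gpus / 1e9
print(f"{total_megawatts:.0f} MW spent just moving signals")  # 180 MW
print(f"${total_cost_billion:.0f}B in transceivers")          # $6B
```

At that scale the optics draw as much power as entire racks of compute, which is why reclaiming even a fraction of those megawatts translates directly into deployable Rubin Ultra racks.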
Enterprise AI: Bringing AI to Every Company
While cloud service providers were the first to adopt AI at scale, NVIDIA is now focused on bringing AI to enterprises worldwide. Huang introduced a new line of computers designed for the AI era:
- DGX Spark: A portable AI system with 20 CPU cores, 128GB of GPU memory, and 1 petaflop of computation for $15,000. "This is clearly the gear of choice," Huang said, announcing that GTC attendees would have first access to reserve units.
- DGX Station: A liquid-cooled personal workstation featuring Grace Blackwell with 20 petaflops of performance, 72 CPU cores, and HBM memory. "This is what a PC should look like," Huang declared.
These systems will be available through OEM partners including HP, Dell, Lenovo, and ASUS, enabling enterprises to build their own AI infrastructure. NVIDIA is also revolutionizing the networking and storage pillars of computing with Spectrum-X for AI networking and GPU-accelerated semantic storage systems.
To power these enterprise AI systems, NVIDIA announced that its reasoning model, R1, is now completely open source as part of NIM (NVIDIA Inference Microservices). The company highlighted partnerships with numerous enterprises including Accenture, AMD, AT&T, BlackRock, Cadence, Capital One, Deloitte, NASDAQ, SAP, and ServiceNow, all integrating NVIDIA technology into their AI frameworks.
Autonomous Vehicles and the Future of Robotics
NVIDIA's vision extends beyond traditional computing to autonomous vehicles and robotics. Huang announced a partnership with General Motors to build their future self-driving car fleet, with AI applied to manufacturing, enterprise operations, and in-vehicle systems.
Huang also highlighted NVIDIA's commitment to safety with its Halos initiative, which has had every line of code safety-assessed by third parties. "This is something I'm very proud of. It rarely gets any attention," he noted.
In the robotics space, NVIDIA introduced several groundbreaking technologies:
- Cosmos: A world foundation model that works with Omniverse to generate virtually unlimited environments for training robots.
- Newton: A physics engine developed in partnership with DeepMind and Disney Research, designed specifically for fine-grained rigid and soft body simulation, tactile feedback, and motor skills training.
- GR00T N1: A generalist foundation model for humanoid robots that was announced as open source during the keynote.
"Physical AI and robotics are moving so fast. Everybody pay attention to this space. This could very well likely be the largest industry of all," Huang predicted.
Conclusion: The Dawn of a New Computing Era
As Jensen Huang wrapped up his keynote, the message was clear: we are witnessing a fundamental transformation in computing, driven by AI's evolution from simple pattern recognition to complex reasoning systems. NVIDIA has positioned itself not just as a chip company but as the architect of the infrastructure that will power this new era.
"We talked about several things," Huang summarized. "One, Blackwell is in full production, and the ramp is incredible, customer demand is incredible, and for good reason because there's an inflection point in AI. The amount of computation we have to do in AI is so much greater as a result of reasoning AI and the training of reasoning AI systems and agentic systems."
The company's three-pronged approach—building AI infrastructure for the cloud, for enterprises, and for robots—reflects its vision of AI as a transformative force across every industry and aspect of human endeavor. With its ambitious roadmap and breakthrough technologies, NVIDIA is not just participating in the AI revolution; it's actively shaping it.
As the world faces challenges like worker shortages and increasing computational demands, NVIDIA's innovations in AI factories, silicon photonics, and robotics offer a glimpse of a future where intelligent systems augment human capabilities and drive productivity to new heights. The tokens of today are indeed becoming the building blocks of tomorrow's intelligence.
For the full keynote, watch the video here.