Next-Gen AI Infrastructure: NVIDIA Blackwell Architecture Sets New Performance Standards with 40x Improvement
Introduction
The annual NVIDIA GTC (GPU Technology Conference) has transformed from what was once described as the "Woodstock of AI" to what is now considered the "Super Bowl of AI." This year's event, hosted at the San Jose Convention Center, featured NVIDIA founder and CEO Jensen Huang delivering a comprehensive keynote presentation on the company's latest innovations and vision for the future of AI computing.
In a presentation delivered without scripts or teleprompters, Huang walked the audience through NVIDIA's latest breakthroughs in AI infrastructure, enterprise computing, and robotics. The keynote highlighted how AI has fundamentally changed computing from a retrieval model to a generative model, requiring massive computational resources and new approaches to system architecture.
Key Points
- AI computation requirements have increased 100x over past estimates due to reasoning capabilities requiring more tokens
- NVIDIA's Blackwell architecture delivers up to 40x the performance of the previous Hopper generation for AI inference workloads
- The new Dynamo OS optimizes AI workloads across GPUs for maximum efficiency in AI factories
- NVIDIA revealed a multi-year roadmap including Blackwell Ultra, Vera Rubin, and Rubin Ultra architectures through 2027
- New silicon photonics technology will enable scaling to millions of GPUs while saving megawatts of power
- Enterprise products include DGX Spark personal AI computer and DGX Station, bringing datacenter capabilities to individuals
- NVIDIA unveiled GR00T N1, an open-source foundation model for humanoid robots, alongside the Newton physics engine
The Evolution of AI and Computational Requirements
Jensen Huang began by reflecting on the remarkable progress of AI over the past decade. What started with perception AI (computer vision and speech recognition) evolved into generative AI (creating content across modalities) and has now entered the era of agentic AI—systems that can perceive, reason, plan, and take action.
"Agentic AI basically means that you have an AI that has agency," Huang explained. "It can perceive and understand the context of the circumstance, it can reason very importantly about how to answer or how to solve a problem, and it can plan and take action."
Huang emphasized that the industry severely underestimated the computational requirements for these advanced AI systems. The reasoning capabilities that make modern AI powerful require generating not just one token (the basic unit of AI text generation) but sequences of tokens representing steps of reasoning.
"This last year, this is where almost the entire world got it wrong," Huang stated. "The computation requirement, the scaling law of AI, is more resilient and in fact hyper-accelerated. The amount of computation we need at this point as a result of agentic AI, as a result of reasoning, is easily a hundred times more than we thought we needed this time last year."
To illustrate this point, Huang demonstrated how a reasoning model like DeepSeek's R1 generates over 8,000 tokens to solve a complex problem (like arranging seating at a wedding), compared to just 439 tokens from a traditional language model that attempts to solve the problem in one shot and gets it wrong.
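The arithmetic behind this claim is straightforward: for a decoder-only model, generation compute scales roughly linearly with the number of tokens produced. A minimal back-of-envelope sketch, using the token counts from the demo and an assumed (hypothetical) 70B-parameter model:

```python
ONE_SHOT_TOKENS = 439      # traditional model, single-pass answer (from the demo)
REASONING_TOKENS = 8_000   # reasoning model, step-by-step chain (from the demo)

# Common rule of thumb: a decoder-only transformer spends roughly
# 2 * N FLOPs per generated token, where N is the parameter count.
PARAMS = 70e9  # assumed model size, not stated in the keynote

def generation_flops(tokens: int, params: float = PARAMS) -> float:
    """Approximate forward-pass FLOPs to generate `tokens` tokens."""
    return 2 * params * tokens

ratio = generation_flops(REASONING_TOKENS) / generation_flops(ONE_SHOT_TOKENS)
print(f"Reasoning uses ~{ratio:.0f}x the generation compute of a one-shot answer")
```

Even before accounting for longer context windows or multiple reasoning attempts, the per-request compute grows by roughly the token ratio, about 18x here, which is why fleet-level demand compounds so quickly.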
NVIDIA Blackwell: The New AI Infrastructure
The centerpiece of Huang's presentation was NVIDIA's new Blackwell architecture, designed specifically to meet the enormous computational demands of modern AI. Blackwell represents a fundamental transition in computer architecture, moving from integrated NVLink to disaggregated NVLink and from air cooling to liquid cooling.
"This is a big deal because we made a fundamental transition in computer architecture," Huang explained, showing the audience a Blackwell compute node. "We wanted to scale up even further... The NVLink switches are in this system embedded on the motherboard. We need to disaggregate the NVLink system and take it out."
The result is the Grace Blackwell NVLink72 rack, which Huang described as "the most extreme scale-up the world has ever done." Each rack contains approximately 600,000 components and 5,000 cables spanning about two miles, and delivers one exaflop of computing power while being fully liquid-cooled.
Compared to the previous Hopper architecture, Blackwell delivers a staggering 25x performance increase at the same power consumption for AI inference workloads. When combined with NVIDIA's new Dynamo software, which optimizes workloads across GPUs, Blackwell achieves up to 40x performance improvement for reasoning models.
"Dynamo does all that. It is essentially the operating system of an AI factory," Huang explained. "We call it Dynamo for a good reason. As you know, the dynamo was the first instrument that started the last industrial revolution."
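NVIDIA describes Dynamo as disaggregating inference work across pools of GPUs, for example separating compute-bound prefill (prompt processing) from bandwidth-bound decode (token generation) so each pool can be sized independently. The sketch below illustrates only that scheduling idea; the pool sizes and the `schedule` function are hypothetical and are not Dynamo's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Pool:
    """A set of GPUs dedicated to one inference phase (illustrative only)."""
    name: str
    gpus: int
    queue: list = field(default_factory=list)

    def load(self) -> float:
        return len(self.queue) / self.gpus

# Hypothetical split: prefill is compute-bound, decode is
# memory-bandwidth-bound, so an AI-factory scheduler can
# provision the two pools with different GPU counts.
prefill = Pool("prefill", gpus=16)
decode = Pool("decode", gpus=56)

def schedule(request_id: str, phase: str) -> str:
    """Route a request to the pool handling its current phase."""
    pool = prefill if phase == "prefill" else decode
    pool.queue.append(request_id)
    return f"{request_id} -> {pool.name} (load {pool.load():.2f})"

print(schedule("req-1", "prefill"))
print(schedule("req-1", "decode"))  # after prefill, the KV cache moves to the decode pool
```

A real system would also migrate KV caches between pools and rebalance as traffic shifts; the point here is only that an "operating system" layer decides where each phase of each request runs.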
NVIDIA's AI Infrastructure Roadmap
In a rare move for the tech industry, Huang laid out NVIDIA's product roadmap for the next several years:
- Blackwell Ultra: Coming in the second half of 2025, featuring 1.5x more FLOPS, 1.5x more memory, and 2x more bandwidth than the current Blackwell.
- Vera Rubin: Named after the astronomer whose measurements provided key evidence for dark matter, this architecture will arrive in the second half of 2026 with NVLink 144, featuring a brand-new GPU, CPU, networking, and HBM4 memory.
- Rubin Ultra: Scheduled for the second half of 2027, this system will feature NVLink 576 for extreme scale-up, with each rack drawing 600 kW of power and containing 25 million parts, delivering 15 exaflops of computing power and 4,600 terabytes per second of scale-up bandwidth.
Huang emphasized that this roadmap is being shared years in advance because building AI infrastructure requires extensive planning: "This isn't like buying a laptop. This isn't discretionary spend. This is spend that we have to plan on, and so we have to plan on having the land and the power, and we have to get our capex ready, and we get engineering teams, and we have to lay it out a couple two, three years in advance."
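Taking only the per-rack figures quoted in this article, a quick calculation shows how far the roadmap stretches (treating the Blackwell rack's "components" and Rubin Ultra's "parts" as roughly comparable counts):

```python
# Per-rack figures quoted in the text: Grace Blackwell rack vs. Rubin Ultra rack.
blackwell = {"exaflops": 1, "parts": 600_000}
rubin_ultra = {"exaflops": 15, "parts": 25_000_000}

compute_gain = rubin_ultra["exaflops"] / blackwell["exaflops"]
parts_gain = rubin_ultra["parts"] / blackwell["parts"]

print(f"Per-rack compute: {compute_gain:.0f}x")        # 15x
print(f"Per-rack component count: {parts_gain:.1f}x")  # ~41.7x
```

A forty-fold jump in parts per rack within a few product generations is exactly the kind of change that forces the land, power, and capex planning Huang describes.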
Silicon Photonics: Enabling Massive Scale
One of the most significant technological breakthroughs Huang announced was NVIDIA's advancements in silicon photonics. As data centers grow to the size of stadiums, traditional copper connections become impractical for long-distance data transmission. The challenge with silicon photonics has been the energy consumption and cost of transceivers that convert electrical signals to optical ones.
Huang introduced NVIDIA's first co-packaged silicon photonic system, featuring micro-ring resonator modulator (MRM) technology developed in partnership with TSMC. This technology dramatically reduces the power consumption and cost associated with optical networking.
"We invented the world's first MRM," Huang said, explaining how the technology works at a microscopic level to modulate light signals. The result is a system that can save tens of megawatts of power in a large data center—power that can instead be used for computing.
NVIDIA will ship its Quantum-X silicon photonics switch in the second half of 2025, with the Spectrum-X version following in the second half of 2026. This technology will enable NVIDIA to scale to multi-hundred-thousand and eventually multi-million GPU systems.
Enterprise AI and Personal AI Computing
Recognizing that AI is expanding beyond cloud data centers, Huang introduced new products designed to bring AI capabilities to enterprises and individuals:
"AI started in the cloud for a good reason... but that's not where AI is limited to. AI will go everywhere," Huang stated.
He unveiled the DGX Spark, a personal AI computer featuring 20 CPU cores, 128GB of unified memory, and one petaflop of AI compute, priced around $3,000. This system is designed to be the development platform for software engineers, data scientists, and AI researchers worldwide.
For more demanding workloads, Huang introduced the DGX Station, a liquid-cooled workstation featuring Grace Blackwell architecture with 20 petaflops of computation power and 72 CPU cores. Both systems will be available through OEM partners including HP, Dell, Lenovo, and ASUS.
Huang also announced partnerships to revolutionize enterprise networking and storage. NVIDIA is working with Cisco to bring AI capabilities to network infrastructure, and with numerous storage companies to create semantics-based storage systems that embed information into knowledge rather than simply retrieving files.
"Rather than a retrieval-based storage system, it's going to be a semantics-based retrieval system, a semantics-based storage system," Huang explained. "The storage system has to be continuously embedding information in the background, taking raw data, embedding it into knowledge, and then later when you access it, you don't retrieve it—you just talk to it."
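The storage model Huang describes amounts to continuous embedding plus similarity search in place of path-based file lookup. Below is a minimal sketch of that idea; the hash-based `toy_embed` is a stand-in for a real learned text encoder (so only exact text matches reliably, whereas a real embedding model would also match paraphrases):

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for an embedding model: a deterministic unit vector
    derived from a hash. Real systems use a learned text encoder."""
    h = hashlib.sha256(text.encode()).digest()
    vec = [b / 255 for b in h[:dim]]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

class SemanticStore:
    """Embeds documents as they are written; queries retrieve by similarity."""
    def __init__(self):
        self.index: dict[str, list[float]] = {}

    def write(self, doc_id: str, text: str) -> None:
        # "Continuously embedding information in the background"
        self.index[doc_id] = toy_embed(text)

    def ask(self, query: str) -> str:
        # Nearest neighbour by cosine similarity instead of a file lookup
        q = toy_embed(query)
        return max(self.index, key=lambda d: cosine(self.index[d], q))

store = SemanticStore()
store.write("q3_report", "quarterly revenue figures")
store.write("onboarding", "new employee handbook")
print(store.ask("quarterly revenue figures"))  # identical text maps to the same vector
```

Production systems replace the toy pieces with a real encoder and an approximate nearest-neighbour index, but the write-time embedding and query-time similarity search are the essence of "you don't retrieve it, you just talk to it."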
The Future of Robotics and Physical AI
The final segment of Huang's presentation focused on robotics, which he described as potentially "the largest industry of all." With the world facing a projected shortage of at least 50 million workers by the end of the decade, robots will play an increasingly important role in the global economy.
"The time has come for robots," Huang declared. "Robots have the benefit of being able to interact with the physical world and do things that otherwise digital information cannot."
Huang introduced GR00T N1, a generalist foundation model for humanoid robots that features a dual-system architecture for "thinking fast and slow." The slow system allows robots to perceive, reason about their environment and instructions, and plan actions, while the fast system translates those plans into precise robot movements.
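This dual-system split can be pictured as two loops running at different rates: a slow planner that perceives and replans a few times per second, and a fast controller that emits motor commands far more often. The rates and functions below are purely illustrative assumptions, not the model's actual interfaces:

```python
# Illustrative dual-rate control loop for a "thinking fast and slow" robot
# policy. The 2 Hz / 100 Hz rates and the two functions are hypothetical.
SLOW_HZ = 2     # planner updates (perceive, reason, plan)
FAST_HZ = 100   # motor-command updates

def slow_plan(observation: str, instruction: str) -> str:
    """Stand-in for the slow system: produce a high-level plan."""
    return f"plan(move toward '{instruction}' given {observation})"

def fast_act(plan: str, t: float) -> str:
    """Stand-in for the fast system: turn the current plan into a motor command."""
    return f"joint_targets[{t:.2f}s] from {plan}"

commands = []
plan = None
steps = FAST_HZ  # simulate one second of control
for i in range(steps):
    t = i / FAST_HZ
    if i % (FAST_HZ // SLOW_HZ) == 0:      # replan at the slow rate
        plan = slow_plan("camera frame", "pick up the cup")
    commands.append(fast_act(plan, t))     # act at the fast rate

print(len(commands), "motor commands,", steps // (FAST_HZ // SLOW_HZ), "replans")
```

The design point is that deliberation and actuation have very different latency budgets, so the expensive reasoning model only needs to run at the slow rate while the controller keeps the robot moving smoothly between plans.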
To support the development of physical AI, NVIDIA announced Newton, a physics engine created in partnership with Google DeepMind and Disney Research. Newton provides realistic simulation of rigid and soft bodies, tactile feedback, and fine motor skills, allowing robots to be trained in faster-than-real-time virtual environments.
In a dramatic conclusion to the keynote, Huang was joined on stage by "Blue," a small expressive robot developed with Disney Research, demonstrating the progress NVIDIA has made in robotics. He announced that GR00T N1 would be open-sourced, making these advanced capabilities available to robotics developers worldwide.
Conclusion
Jensen Huang's GTC 2025 keynote painted a comprehensive picture of NVIDIA's vision for the future of computing, one where AI factories generate tokens that power everything from enterprise software to humanoid robots. The company's investments in silicon architecture, networking technology, software infrastructure, and physics simulation reflect a holistic approach to addressing the challenges of modern AI.
"We're building AI factories and AI infrastructure. It's going to take years of planning," Huang emphasized throughout his presentation. With its ambitious roadmap and technological breakthroughs, NVIDIA is positioning itself not just as a chip company but as the architect of the infrastructure that will power the next generation of artificial intelligence.
As the keynote concluded with a spectacular video demonstration of AI-generated visuals, Huang's message was clear: NVIDIA is committed to pushing the boundaries of what's possible in computing, and the journey is just beginning.
For the full keynote, watch the video.