NVIDIA CEO Jensen Huang GTC 2026 Full Keynote
Key Moments
NVIDIA's GTC 2026: AI factories, tokens as the new commodity, new architectures, and the agent revolution.
Key Insights
Introduced the 'AI factory' concept, positioning tokens as the new commodity and driving immense compute demand.
Introduced next-generation architectures (Grace Blackwell, Vera Rubin) and hybrid AI processing with Groq.
Positioned 'OpenClaw' as the 'operating system' for agentic AI, emphasizing security and enterprise readiness.
Expanded AI's reach into physical robotics and autonomous vehicles with advanced simulation and foundational models.
Emphasized NVIDIA's vertically integrated yet horizontally open approach, bolstered by extensive CUDA X libraries and an expanding ecosystem.
Highlighted the shift from training to inference as the primary driver of AI computing demand and revenue.
THE RISE OF TOKENS AND AI FACTORIES
Jensen Huang opened GTC 2026 by framing intelligence creation around "tokens" as the fundamental building blocks of AI. He introduced the concept of 'AI Factories' – a new type of industrialized production for these tokens. This vision positions tokens as the new commodity, driving an unprecedented demand for compute power and signaling a paradigm shift in how computing is consumed and monetized. The entire industry is now focused on optimizing these token production factories.
ARCHITECTURAL INNOVATIONS AND HYBRID PROCESSING
NVIDIA unveiled its next-generation computing platforms, including the Grace Blackwell and Vera Rubin systems, engineered for extreme performance and efficiency in AI workloads. A significant development was the integration with Groq's technology, creating a hybrid processing approach. This combines NVIDIA's strengths in high-throughput computation with Groq's specialized inference capabilities, particularly for critical, low-latency tasks like coding and agentic AI operations, aiming to push performance boundaries further.
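The hybrid split described above, high-throughput GPU computation for bulk work and specialized low-latency inference for interactive agent and coding tasks, can be sketched as a simple request router. The class, threshold, and path names here are illustrative assumptions, not NVIDIA or Groq APIs:

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    latency_critical: bool  # e.g. an interactive coding or agent step

def route(req: Request) -> str:
    """Illustrative policy: latency-critical work goes to the low-latency
    LPU path; everything else goes to the batched, high-throughput GPU path."""
    if req.latency_critical and req.prompt_tokens < 8_000:
        return "lpu"
    return "gpu"

# An agent's tool-call step vs. an offline summarization job
assert route(Request(512, latency_critical=True)) == "lpu"
assert route(Request(200_000, latency_critical=False)) == "gpu"
```

A real scheduler would also weigh queue depth and model placement; the point is only that the two processor classes serve different latency/throughput regimes.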
OPENCLAW: THE OPERATING SYSTEM FOR AGENTS
The GTC keynote introduced 'OpenClaw,' presented as the operating system for agentic AI. This open-source framework standardizes the creation and deployment of AI agents, enabling them to perceive, reason, and act across digital and physical domains. NVIDIA is heavily investing in this ecosystem, offering a reference design 'Nemo Claw' for enterprise readiness, security, and privacy, positioning it as the next major platform shift comparable to Linux or the internet.
PHYSICAL AI, ROBOTICS, AND AUTONOMOUS SYSTEMS
NVIDIA is extending its AI revolution into the physical world with significant advancements in robotics and autonomous vehicles. Leveraging sophisticated simulation tools like Isaac Lab and Newton, the company is enabling the training of physically embodied AI. The integration of foundational models and the 'Alpamayo' platform is pushing autonomous driving towards its 'ChatGPT moment,' with partnerships announced for robo-taxi deployment, marking a new era of physical AI.
VERTICAL INTEGRATION AND HORIZONTAL OPENNESS
Huang reiterated NVIDIA's strategy as vertically integrated yet horizontally open. This approach involves deep understanding and optimization across hardware, software libraries (like CUDA X), and application domains, ensuring accelerated computing is tailored for every industry. Simultaneously, NVIDIA maintains an open ecosystem, integrating its technologies with cloud providers, system makers, and software partners to make its platforms accessible and universally applicable.
THE INFERENCE INFLECTION AND THE FUTURE OF COMPUTING
A central theme was the 'inference inflection,' highlighting that AI's primary computational demand has shifted from training to inference. This shift drives the exponential growth in compute orders, with NVIDIA forecasting over $1 trillion in demand through 2027. The focus is on maximizing 'tokens per watt,' ensuring data centers, now considered token factories, operate at peak efficiency to meet this escalating demand and unlock new revenue streams.
ADVANCEMENTS IN DATA PROCESSING AND CONNECTIVITY
NVIDIA introduced foundational libraries like cuDF for structured data and cuVS for unstructured (vector) data, accelerating enterprise data processing. The company is also innovating in connectivity with technologies like NVLink 72 and Spectrum-X co-packaged optics, crucial for building massive AI supercomputers. A new class of data center CPUs and storage solutions is being developed to handle the intense demands of AI agents and generative models.
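The unstructured (vector) data path above boils down to nearest-neighbor search over embeddings. The pure-Python sketch below shows the core operation; GPU vector-search libraries replace this brute-force O(n) scan with accelerated approximate-nearest-neighbor indexes. The document names and embedding values are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest(query, corpus):
    """Brute-force nearest neighbor: scan every (name, embedding) pair.
    Vector libraries accelerate exactly this lookup at billion-item scale."""
    return max(corpus, key=lambda item: cosine(query, item[1]))

docs = [("invoice", [0.9, 0.1, 0.0]),
        ("contract", [0.2, 0.8, 0.1]),
        ("memo", [0.1, 0.2, 0.9])]
name, _ = nearest([0.85, 0.15, 0.05], docs)
assert name == "invoice"
```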
THE AI FACTORY ECOSYSTEM AND DIGITAL TWINS
To manage the complexity of building and operating AI factories, NVIDIA launched the DSX platform. This leverages Omniverse for creating digital twins of AI factories, enabling simulation, design, and dynamic power management. Collaborations with partners like Siemens and PTC are crucial for this ecosystem, ensuring optimal energy efficiency and maximum token throughput by minimizing wasted power and optimizing operations across the entire infrastructure.
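Dynamic power management for an AI factory amounts to an allocation problem: fit the most token throughput inside a fixed power budget. A toy greedy allocator makes the idea concrete; all workload figures and names are invented for illustration, not DSX behavior:

```python
def allocate(budget_kw: float, workloads: list) -> float:
    """workloads: (name, power_kw, tokens_per_sec) tuples.
    Greedily admit the most power-efficient workloads (tokens per kW)
    until the budget is exhausted; return total tokens/sec served."""
    total = 0.0
    for name, kw, tps in sorted(workloads, key=lambda w: w[2] / w[1], reverse=True):
        if kw <= budget_kw:
            budget_kw -= kw
            total += tps
    return total

jobs = [("batch-inference", 40, 30_000),      # 750 tokens/sec per kW
        ("interactive-agents", 30, 12_000),   # 400 tokens/sec per kW
        ("training", 80, 20_000)]             # 250 tokens/sec per kW

# With a 100 kW budget, the two efficient jobs fit; training is deferred.
assert allocate(100, jobs) == 42_000.0
```

A production system would reschedule continuously and cap rather than defer workloads, but the objective, maximum token throughput per watt of facility power, is the same one the digital twin is used to optimize.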
OPEN MODELS AND SOVEREIGN AI INITIATIVES
NVIDIA is fostering a diverse AI ecosystem through its 'Open Models' initiative, offering millions of open-source models across various domains like language, biology, and physics. It is also actively working with countries to build 'sovereign AI' capabilities, customizing foundational models like Nemotron to meet specific regional needs. This democratizes AI development and allows for specialized intelligence tailored to unique industry requirements.
THE CLOUD AND ENTERPRISE ADOPTION
NVIDIA's strategy heavily involves deep integration with all major cloud service providers (AWS, Azure, Google Cloud), acting as a customer acquisition engine by enabling accelerated workloads on their platforms. In the enterprise, the shift is towards agentic systems that will transform traditional IT into 'Agentic as a Service' companies, with every software company needing an 'OpenClaw strategy' to leverage AI agents for enhanced productivity and customer offerings.
NVIDIA Inference Performance per Watt Evolution (Hopper vs. Blackwell)
Data extracted from this episode
| Architecture | Performance vs. Expected (Moore's Law Basis) | Tokens per Watt (relative) | Cost per Token | Key Innovations |
|---|---|---|---|---|
| Hopper H200 (previous generation) | 1.5x (expected) | Baseline | Higher | FP8 Transformer Engine, NVLink 4 |
| Grace Blackwell NVLink 72 (current generation) | 35-50x (actual, vs. 1.5x expected) | 35-50x higher | Lowest in the world | NVLink 72, NVFP4, Dynamo, TensorRT-LLM |
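A tokens-per-watt gain flows directly into energy cost per token. The back-of-envelope calculation below uses placeholder figures (power budget, electricity price, and throughputs are illustrative, not NVIDIA numbers); only the 35x ratio comes from the table above:

```python
POWER_KW = 100.0        # assumed rack power budget
PRICE_PER_KWH = 0.10    # assumed electricity price, USD

def energy_cost_per_million_tokens(tokens_per_sec: float) -> float:
    """Electricity cost (USD) to produce one million tokens at a
    fixed power draw and the given sustained throughput."""
    kwh_per_sec = POWER_KW / 3600.0
    tokens_per_kwh = tokens_per_sec / kwh_per_sec
    return 1e6 / tokens_per_kwh * PRICE_PER_KWH

baseline = energy_cost_per_million_tokens(1_000)    # baseline-class rate
improved = energy_cost_per_million_tokens(35_000)   # 35x tokens/watt claim
print(f"{baseline / improved:.0f}x cheaper energy per token")  # 35x
```

Since power draw is held constant, the cost ratio equals the throughput ratio: this is why "tokens per watt" is the headline efficiency metric for a token factory.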
AI Model Tiers and Revenue Potential per Million Tokens
Data extracted from this episode
| Tier | Throughput/Speed | Input Context Length | Price per Million Tokens | Revenue Benefit (relative) |
|---|---|---|---|---|
| Free Tier | High Throughput, Low Speed | 100,000 tokens | $0 | Attracts more customers |
| First Tier | Medium Throughput, Medium Speed | Increased | $3 | Baseline monetization |
| Next Tier | Higher Throughput, Higher Speed | Larger | $6 | Increased Value |
| High Tier | High Performance | Millions of tokens | $45 | Significant Monetization |
| Premium Tier (Future) | Incredibly High Speed | Very Long Input (research) | $150 | Maximized Value for critical paths |
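The tier table implies a simple revenue model: throughput per tier times price per million tokens. The sketch below uses the table's prices; the traffic mix is a hypothetical example, not data from the episode:

```python
# Prices per million tokens, from the tier table above (USD).
PRICE_PER_M = {"free": 0.0, "first": 3.0, "next": 6.0,
               "high": 45.0, "premium": 150.0}

def hourly_revenue(tokens_per_sec_by_tier: dict) -> float:
    """Revenue (USD/hour) for a factory serving a mix of pricing tiers."""
    total = 0.0
    for tier, tps in tokens_per_sec_by_tier.items():
        tokens_per_hour = tps * 3600
        total += tokens_per_hour / 1e6 * PRICE_PER_M[tier]
    return total

# Hypothetical mix: lots of free traffic, a thin premium slice.
mix = {"free": 50_000, "first": 20_000, "high": 5_000, "premium": 500}
rev = hourly_revenue(mix)  # ~1296 USD/hour
```

Note how the premium slice (500 tokens/sec) earns more per hour than the entire first tier at 40x the throughput, which is the monetization argument behind the tiering.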
NVIDIA AI Architectures and Compute Growth (2016-Future)
Data extracted from this episode
| Year | Architecture/System | Key Feature | Compute (Teraflops/Exaflops) | GPUs/Nodes | Compute Gain (vs. 2016) |
|---|---|---|---|---|---|
| 2016 | DGX-1 (Pascal) | First computer for deep learning | 170 teraflops | 8 GPUs, 1 NVLink | 1x |
| 2018 | DGX-2 (Volta) | NVLink Switch, all-to-all bandwidth | N/A | 16 GPUs | N/A |
| 2020 | DGX A100 SuperPOD | Scale-up/scale-out, NVLink 3 | N/A | N/A | N/A |
| 2022 | Hopper | FP8 Transformer Engine, generative AI era, NVLink 4 | N/A | N/A | N/A |
| 2024 | Blackwell | NVLink 72, AI supercomputing system architecture | N/A | 72 GPUs | N/A |
| 2026 | Vera Rubin | NVLink 72, agentic AI; integrated CPU, storage, networking, security | 3.6 exaflops | 72 GPUs | ~40,000,000x (over 10 years) |
Common Questions
How is NVIDIA positioned strategically?
NVIDIA operates as a platform company with vertically integrated development (from chips to algorithms) and horizontally open integration (working with diverse partners). Its strategy focuses on accelerating domain-specific applications across various industries using the CUDA-X libraries and a growing installed base of GPUs.
Topics
Mentioned in this video
NVIDIA's platform encompassing numerous libraries and algorithms for accelerated computing, celebrating its 20th anniversary. It's integrated into every ecosystem.
A declarative language to query data, invented by IBM, forming a foundation of modern enterprise computing.
Microsoft's search engine, which NVIDIA helps accelerate as part of their partnership with Azure.
An open-source machine learning framework; NVIDIA describes itself as the only accelerator in the world with 'incredible' support for it.
A CUDA-X library for decision optimization.
A CUDA-X library mentioned for direct sparse solvers.
An AI code completion tool, mentioned as 'Codex' in the transcript, assisting software engineers at NVIDIA.
A high-performance C++ open-source data processing library mentioned for structured data.
Google's suite of cloud computing services, mentioned for BigQuery and Vertex AI acceleration, as well as a partnership with Snapchat.
NVIDIA's foundational library for accelerating structured data processing (data frames), integrated into platforms like IBM Watson X.
A CUDA-X library for computational aerodynamics.
A CUDA-X library for computational lithography.
A CUDA-X library for genomics, accelerating genomic analysis.
An AI-powered code editor, used by NVIDIA software engineers for assistance in coding.
A new algorithm mentioned alongside Dynamo as part of NVIDIA's efforts to optimize AI performance.
A data analysis and manipulation library for Python, mentioned as a platform for handling structured data.
IBM's platform, with its SQL engines being accelerated by NVIDIA GPU computing libraries, showcasing significant speedup and cost reduction for enterprises like Nestle.
An NVIDIA software for differentiable physics, co-developed with Disney and DeepMind for robotics simulation.
A Product Lifecycle Management (PLM) software managing SIM-ready assets from NVIDIA and equipment manufacturers in the DSX platform.
A leading simulation tool used in the NVIDIA DSX platform for testing external thermals of AI factories.
NVIDIA's AI for autonomous vehicles, enabling reasoning and safe operation across scenarios.
A framework for developing applications powered by language models, noted for a billion downloads and used for creating custom agents, joining the Nemotron coalition.
NVIDIA's next-generation graphics technology, using neural rendering to fuse 3D graphics with generative AI for realistic and controllable content.
A cloud big data platform by Amazon Web Services, mentioned for processing structured data.
Google Cloud's serverless data warehouse, accelerated by NVIDIA for Google Cloud customers.
NVIDIA's first cloud partner, accelerating services like EMR, SageMaker, and Bedrock, and facilitating the deployment of OpenAI on AWS.
An NVIDIA platform or library for AI RAN (Radio Access Network), significant for telecommunications.
An agentic model which revolutionized software engineering by its ability to read files, code, compile, test, evaluate, and iterate.
An open-source system for automating deployment, scaling, and management of containerized applications, an enabler of the modern cloud, compared to OpenClaw.
NVIDIA's reasoning models for language, visual understanding, RAG, safety, and speech, part of its Open Models initiative.
Microsoft's cloud computing service, mentioned for its Fabric platform and in partnership with NVIDIA for confidential computing and AI foundry acceleration.
NVIDIA's foundational library for accelerating unstructured data processing (vector stores, semantic data, AI data).
A platform integrating NVIDIA's cuDF and cuVS libraries to accelerate data processing for the AI era.
Google's unified machine learning platform, which NVIDIA accelerates.
A brand new type of AI platform created in partnership with NVIDIA and Dell, capable of on-premise and air-gapped deployments.
A CUDA-X library for geometry-aware neural networks.
An open-source project by Peter Steinberger, described as the 'operating system of agent computers' or 'personal agents,' revolutionizing enterprise IT.
NVIDIA's reference design for an enterprise-ready, secure, and private OpenClaw stack with policy guardrails and a privacy router.
One of the most important libraries created by NVIDIA, which revolutionized artificial intelligence and caused the 'big bang' of modern AI.
An AI chatbot by OpenAI that started the generative AI era, capable of understanding, perceiving, translating, and generating unique content.
Mentioned as an important internal AI consumption workload, shifting from traditional recommender systems to deep learning and large language models.
NVIDIA's platform designed to hold the world's digital twins, enabling virtual design and simulation of AI factories.
A platform for model-based systems engineering, used in conjunction with NVIDIA DSX for AI factory design.
A construction management software, used to virtually commission AI factories through NVIDIA DSX to ensure accelerated construction time.
An open-source operating system, compared to OpenClaw for its foundational impact on computing.
NVIDIA's open models for biology, chemistry, and molecular design.
A multimodal agentic system (AI search engine) recommended for its quality, joining the Nemotron coalition.
NVIDIA's new AI factory platform, an Omniverse digital twin blueprint for designing and operating AI factories for maximum token throughput, resilience, and energy efficiency.
NVIDIA's models for weather and climate forecasting, rooted in AI physics.
Technology integrated into OpenClaw to make it enterprise-secure and privacy-capable for sensitive corporate networks.
NVIDIA's open-source platform for robot training and evaluation in simulation, used by various companies for synthetic data generation and policy training.
NVIDIA's consumer GPU brand, which pioneered programmable shaders 25 years ago and laid the groundwork for CUDA.
NVIDIA's architecture for modern computer graphics, introduced 8-10 years ago, that fused programmable shading with hardware ray tracing and AI.
An NVIDIA GPU supercomputer, the first of which was installed at Azure, leading to the partnership with OpenAI.
NVIDIA's previous GPU architecture (e.g., H200), which revolutionized computing with its FP8 Transformer Engine and NVLink 4, but is now superseded by Blackwell and Rubin.
An NVIDIA supercomputing capability, representing billions of dollars in investment, used to optimize kernels and the complete stack for inference.
NVIDIA's platform for co-packaged optics (CPO) Ethernet switches, increasing energy efficiency and resilience in AI factories.
A new LPU (Language Processing Unit) chip, part of the Feynman generation, developed jointly by the NVIDIA and Groq teams.
NVIDIA's automotive superchip, mentioned as being radiation-approved for space applications like satellites.
A re-architected system by NVIDIA, integrating 72 GPUs, which was a giant bet but delivered significant improvements in inference performance and energy efficiency.
The first GPU with the FP8 Transformer engine, which launched the generative AI era, using NVLink 4 and Bluefield 3 DPUs.
NVIDIA's future GPU architecture, projected to generate five times more revenue than Blackwell for AI factories.
A new CPU, short for Rosslyn, part of the Feynman generation, connecting with Bluefield 5 and SuperNIC CX10.
The first GPU supercomputer combining scale-up and scale-out architecture, using NVLink 3 and Quantum InfiniBand.
NVIDIA's next-generation SuperNIC, connecting with Bluefield 5 and the new Rosa CPU.
A new computer being developed by NVIDIA and partners to build data centers in space, designed to handle cooling challenges in a vacuum.
Introduced by IBM 60 years ago, it was the first modern platform for general-purpose computing, launching the computing era.
Introduced in 2016, the world's first computer designed for deep learning, featuring eight Pascal GPUs connected with first-generation NVLink.
Groq's LPU chip, which is integrated with Vera Rubin systems, manufactured by Samsung, and in full production, offering high-speed token acceleration.
A new rack system for Rubin Ultra, enabling connection of 144 GPUs in one NVLink domain, designed for vertical integration.
NVIDIA's next-generation scaling solution for Kyber racks, utilizing co-packaged optics for scale-up, alongside copper options.
NVIDIA's future GPU architecture after Rubin, featuring a new GPU, a new LPU (LP40), and a new CPU (Rosa).
NVIDIA's next-generation Data Processing Unit, connecting the new Rosa CPU with the SuperNIC CX10.
A new operating system for AI factories, invented by NVIDIA, that enables the disaggregation of inference workloads between different processors.
NVIDIA's Data Processing Unit, integrated into the Hopper and Blackwell architectures for improved networking and security.
A brand new CPU designed by NVIDIA for extremely high single-threaded performance, data output, data processing, and energy efficiency, using LPDDR5.
Pioneer in deep learning, whose work, enabled by GeForce, demonstrated the GPU's potential for accelerating deep learning.
Prominent AI researcher, whose work was enabled by GeForce GPUs in accelerating deep learning.
Considered the 'Godfather of AI' and a pioneer in deep learning, whose work was enabled by GeForce GPUs.
Founder and CEO of NVIDIA, giving the keynote speech at GTC. He emphasizes NVIDIA's strategy and technological advancements.
Co-founder of OpenAI and a pioneer in deep learning, whose work was enabled by GeForce GPUs.
NVIDIA's next-generation GPU architecture, mentioned with Grace Blackwell NVLink 72, delivering 35-50x performance per watt improvement for inference.
Mentioned in the OpenClaw video clip as launching 'research,' which seems to refer to a variant or application of agentic systems.
NVIDIA's extensible and GPU-accelerated differentiable physics simulation, used by Disney and other robotics developers.
NVIDIA's supercomputing system, combining Grace CPUs and Blackwell GPUs with NVLink 72, representing a huge leap in AI performance and efficiency.
Analyst from SemiAnalysis who accused Jensen Huang of 'sandbagging' NVIDIA's Grace Blackwell's performance, finding it to be even better than claimed.
NVIDIA's most advanced AI supercomputing platform, architected for agentic AI, featuring NVLink 72, 3.6 exaflops of compute, and five rack-scale computers.
Creator of OpenClaw, an open-source project for AI agents that rapidly became the most popular open-source project.
Venture capital firm with which Alfred Lin (described as NVIDIA's first VC) is associated.
Company of Sarah Guo, mentioned as a pre-game show host; likely refers to Conviction, a venture capital firm.
A heavy equipment manufacturer mentioned as a partner for robotic implementations.
Research division of Disney, using NVIDIA's physics simulator Newton and Isaac Lab to train policies across their character robots, like Olaf.
A research firm whose report confirmed NVIDIA's inference performance claims for Grace Blackwell NVLink 72, with analyst Dylan Patel even accusing Jensen Huang of 'sandbagging' due to exceeding expectations.
Mentioned as a customer and developer using NVIDIA technologies integrated into cloud services.
An AI safety and research company, partnering with NVIDIA, producing models that benefit from NVIDIA's confidential computing.
Cloud infrastructure provider where NVIDIA was their first AI customer, and now a key partner for AI cloud deployments.
Mentioned as a customer and developer using NVIDIA technologies integrated into cloud services.
An electronic design automation (EDA) company, a partner of NVIDIA that uses their acceleration for EDA and CA workflows.
A software company specializing in big data analytics, partnering with NVIDIA and Dell to create the Palantir Ontology Platform.
A large established company mentioned as part of NVIDIA's diverse ecosystem of partners.
The inventor of SQL and System/360, partnering with NVIDIA to accelerate Watson X Data SQL engines with GPU computing libraries.
A world-leading computer systems and storage provider, partnered with NVIDIA to create the Dell AI Data Platform.
Social media company that reduced its computing cost by nearly 80% by using NVIDIA accelerated Google Cloud services.
Mentioned as a customer and developer using NVIDIA technologies integrated into cloud services.
An inference service provider that experienced a 7x increase in token speeds (from 700 to 5,000 tokens/second) after updating to NVIDIA's optimized software.
A company that joined NVIDIA, contributing InfiniBand technology for scaling up and scaling out GPU supercomputers.
A platform company with three main platforms: CUDA-X, Systems, and AI Factories. They are vertically integrated and horizontally open, developing chips, systems, and software libraries for AI acceleration across numerous industries.
A global company that uses accelerated Watson X Data running on NVIDIA GPUs to refresh its supply chain data mart five times faster at 83% lower cost.
A consequential company from previous computing platform shifts, also a major cloud partner for NVIDIA.
A South Korean multinational automotive manufacturer, one of four new partners for NVIDIA's robotaxi-ready platform.
An automotive manufacturer already partnering with NVIDIA for robotaxi-ready platforms.
A collaborative robotics company working with NVIDIA to implement physical AI models into manufacturing lines.
A cloud-based data warehousing company, one of the platforms processing data frames.
Global IT services company mentioned as a user of the Dell AI Data Platform, experiencing huge speedups.
A large established automotive company, now a partner in NVIDIA's self-driving car platform.
A consequential company from previous computing platform shifts, also a partner using NVIDIA's AI compute.
Fine-tunes Groot models in Isaac Lab for their robotics applications.
A company providing a data and AI platform, mentioned as processing data frames.
Mentioned as a customer and developer using NVIDIA technologies integrated into cloud services.
A leading AI research and deployment company, whose compute-constrained models will be accelerated by NVIDIA on AWS and Azure, and who started the generative AI era with ChatGPT.
Described as the world's first AI native cloud, a company specifically built to host GPUs for AI clouds, partnering with NVIDIA.
A large established financial services company mentioned as part of NVIDIA's diverse ecosystem of partners.
A company with deterministic data flow processors (LPUs), whose technology was acquired and integrated into NVIDIA's Vera Rubin systems to enhance low-latency inference.
Manufacturer of the Groq LP30 chip, thanked by Jensen Huang for their production efforts.
An automotive manufacturer already partnering with NVIDIA for robotaxi-ready platforms.
A robotics company working with NVIDIA to implement physical AI models into manufacturing lines.
A telecommunications company partnering with NVIDIA for Aerial AI RAN, transforming radio towers into robotics radio towers.
A large established company mentioned as part of NVIDIA's diverse ecosystem of partners.
A consequential company from previous computing platform shifts, also a major cloud partner for NVIDIA.
A company providing Reality for internal thermal simulation in the NVIDIA DSX platform.
An imaging company joining NVIDIA's Nemotron coalition for sovereign AI, implying a focus on domain-specific models.
An AI company mentioned as part of the Nemotron coalition, producing incredible models.
An AI company from India, identified as 'Reflection Sarv' in the transcript, joining the Nemotron coalition.
A Chinese multinational manufacturing company, one of four new partners for NVIDIA's robotaxi-ready platform.
Uses Isaac Lab for training and data generation.
Taiwan Semiconductor Manufacturing Company, with whom NVIDIA invented the process technology for co-packaged optics (CPO) used in Spectrum X switches.
An engineering firm that brings data into their custom Omniverse app to finalize AI factory designs.
A company mentioned alongside Mira Murati's lab as joining the Nemotron coalition.
A Japanese multinational automobile manufacturer, one of four new partners for NVIDIA's robotaxi-ready platform.
A Chinese electric vehicle brand, identified as 'Ji' in the transcript, one of four new partners for NVIDIA's robotaxi-ready platform.
Ride-sharing company partnering with NVIDIA to deploy robotaxi-ready vehicles into their network across multiple cities.
A German manufacturer of industrial robots, working with NVIDIA to integrate physical AI models.
Trains their operating room assistant robot in NVIDIA Isaac Lab, multiplying their data with NVIDIA Cosmos World models.
Uses Isaac Lab to train whole-body control and manipulation policies for humanoid robots.
Uses Isaac Lab for training and data generation for their robots.
Uses Isaac Lab and Cosmos to generate post-training data for their skilled AI brain, hardening models with reinforcement learning.
An AI research lab, co-developed the Newton solver on NVIDIA Warp, enabling realistic physics for character robots.
NVIDIA's innovation in tensor core and computational unit that enables inference with gigantic boosts in performance and energy efficiency without loss of precision, also usable for training.
The observation that the number of transistors in an integrated circuit doubles approximately every two years, which NVIDIA argues has 'run out of steam' necessitating accelerated computing.
The markup language that started the World Wide Web, compared to OpenClaw for its foundational impact.
A beloved Disney character represented as a robot at GTC, demonstrating physics simulation and AI adaptation to the physical world using NVIDIA's platforms.
NVIDIA's frontier models for physical AI world generation and understanding, used with Isaac Lab for synthetic data generation.
NVIDIA's open robotics foundation models for general-purpose robot reasoning and action generation.