Key Moments

Jensen Huang: NVIDIA - The $4 Trillion Company & the AI Revolution | Lex Fridman Podcast #494

Lex Fridman
Science & Technology · 7 min read · 146 min video
Mar 23, 2026 · 303,588 views
TL;DR

NVIDIA founder Jensen Huang reveals that the company's massive success is built not just on chips, but on meticulously co-designing entire data centers and fostering a massive CUDA developer ecosystem, which safeguards against competition.

Key Insights

1. NVIDIA's 'extreme co-design' strategy now encompasses GPUs, CPUs, memory, networking, storage, power, cooling, software, and the entire rack and data center to handle distributed AI workloads.

2. The strategic decision to put CUDA on GeForce GPUs, despite a 50% cost increase that drove NVIDIA's market cap down to $1.5 billion, was crucial for building the essential installed base for its computing platform.

3. NVIDIA actively shapes the future by "manifesting" it through rigorous reasoning, step-by-step decision-making, and continuously shaping the belief systems of its employees, board, partners, and customers.

4. The four scaling laws of AI (pre-training, post-training, test-time, and agentic scaling) demonstrate that AI's growth is now limited by compute rather than data, with agentic systems multiplying AI capabilities.

5. NVIDIA envisions AI as "token factories" generating revenue, with agents like OpenClaw being the "iPhone of tokens," marking a massive shift from retrieval-based computing to generative, contextually aware systems.

6. Companies and professionals who build expertise in AI tools are poised for greater success, with Jensen Huang predicting a future where AI elevates professions rather than simply eliminating them.

Extreme co-design: architecting the AI factory

Jensen Huang explains that NVIDIA has moved beyond designing individual GPUs to 'extreme co-design,' encompassing the entire system: GPUs, CPUs, memory, networking, storage, power, cooling, software, and even the rack and data center itself. This holistic approach is necessary because the massive AI problems being tackled no longer fit within a single computer or even a cluster of computers. Distributing these workloads across thousands of interconnected systems introduces complex challenges in networking, computation, and data synchronization, requiring optimization across the entire software and hardware stack. The architecture of NVIDIA itself is designed to mirror this co-design philosophy, with large, multidisciplinary staff meetings where experts from various fields contribute to problem-solving, ensuring that component design considers the needs of the entire system. This integrated approach is fundamental to accelerating computation beyond linear scaling and overcoming the limitations of traditional Moore's Law advancements. The goal is to build the machinery that produces AI, a 'factory that makes AI'.

The pivotal bet on CUDA and GeForce

Huang recounts the critical decision to integrate CUDA onto GeForce GPUs, a move he describes as an 'existential threat' due to its immense cost. At the time, this decision consumed all of NVIDIA's gross profit dollars and led to a significant drop in market capitalization, from ~$6-7 billion to $1.5 billion. However, Huang firmly believed in CUDA's potential as a foundational element for computation. The strategy was to leverage the existing massive installed base of GeForce GPUs, which were already selling millions of units annually, to introduce developers to the CUDA platform. By putting CUDA in every PC, NVIDIA aimed to cultivate a developer ecosystem, attracting researchers and scientists to the platform. This gamble paid off, as CUDA became the bedrock for the deep learning revolution, demonstrating that an architecture's success is defined by its installed base and developer adoption rather than pure technological elegance. NVIDIA's subsequent growth was built on the foundation laid by GeForce carrying CUDA out to millions.

Manifesting the future through relentless reasoning

NVIDIA's ability to make bold, future-defining bets stems from Huang's practice of 'manifesting a future' through deep curiosity and rigorous reasoning. He emphasizes that when a future outcome becomes convincingly clear in his mind, he believes it so strongly that he sees it as inevitable. This process isn't about sudden pronouncements but a daily, incremental shaping of belief systems. Huang communicates his evolving thinking to his board, management team, and employees, laying the groundwork for future decisions. By the time a major initiative like 'going all-in on deep learning' or acquiring a company like Mellanox is announced, the team is already largely bought in, often feeling that the decision was overdue. This consistent, transparent communication and reasoning process, often shared publicly through keynotes, ensures broad alignment and buy-in, making major strategic shifts appear obvious when they are finally declared. This approach extends to partners and the broader industry, shaping the entire innovation landscape.

The four scaling laws of AI and the limits of compute

Huang outlines four key scaling laws driving AI advancement: pre-training (requiring more data for larger models), post-training (enhancing data, with a shift towards synthetic data), test-time scaling (inference becoming increasingly compute-intensive and critical), and agentic scaling (AI systems spawning sub-agents, creating large teams). He posits that AI training is no longer limited by data, as synthetic data generation is accelerating, but by compute. Inference, or 'thinking,' is described as far more computationally intensive than pre-training ('memorization and generalization'). The emergence of agentic systems, which can research, use tools, and spawn sub-agents, signifies a new era of multiplying AI capabilities. This continuous cycle of data generation, training, refinement, and application drives intelligence forward, with compute being the primary scaling factor. The challenge lies in anticipating hardware needs for evolving architectures, like mixtures of experts with sparsity, which requires significant foresight and investment due to hardware development cycles.

OpenClaw and the reinvention of the computer

The development of agentic systems like OpenClaw signifies a fundamental shift in computing, transforming it from a retrieval-based system to a generative one. Huang likens the AI agent to a digital worker that must access ground truth (file systems), conduct research, and use tools. He argues against the notion that AI will make software and tools obsolete, comparing it to a robot using a microwave rather than its fingers to boil water. Learning to use tools, accessing information, and engaging with the world are precisely the capabilities OpenClaw exhibits. Huang believes this represents the reinvention of the computer. Concepts behind OpenClaw were being discussed two years earlier at GTC, anticipating the need for AI to perform research, use tools, and access data. The success of OpenClaw is attributed to breakthroughs in large language models and the creation of a robust open-source platform, akin to how ChatGPT democratized generative AI.

NVIDIA's moat: the CUDA ecosystem and execution velocity

NVIDIA's primary competitive advantage, or 'moat,' is the massive installed base of its computing platform, particularly CUDA. Huang states that this ecosystem, built over decades, is more critical than any single technological innovation. The dedication of 43,000 NVIDIA employees and millions of developers who have committed their software to CUDA ensures its dominance. This, combined with NVIDIA's execution velocity—the ability to build increasingly complex systems annually at an unprecedented scale—creates a formidable barrier to entry. Developers are incentivized to target CUDA first because it offers 10x better performance on average within six months, reaches hundreds of millions of users across all clouds and industries, and is trusted for long-term maintenance and optimization. NVIDIA's vertical integration of complex systems, coupled with its horizontal integration across every company's infrastructure, creates a broad ecosystem that covers virtually every industry worldwide, from cloud computing to cars and satellites.

The AI factory and the future of computing

Huang envisions NVIDIA's future centered around the 'AI factory,' a departure from the traditional view of computing as a chip or even a computer. The mental model has shifted from picking up a chip to visualizing gigawatt-scale, integrated systems with power generation, cooling, and massive networking. These AI factories are not just for generating products but directly correlate with company revenues, moving beyond warehouses to profit-generating engines. The 'tokens' they generate are becoming valuable commodities, segmented like iPhones, with intelligence itself being a scalable and revenue-generating product. Huang is certain that this fundamental shift in computing—from retrieval to generative, contextually aware systems—will drive global GDP growth, with computation consuming a significantly larger portion of the economy. He believes NVIDIA's growth is inevitable, potentially reaching multi-trillion-dollar valuations due to the vast and largely untapped opportunity it addresses.

Humanity, intelligence, and the future of work

Huang distinguishes between 'intelligence' as a functional, commoditized capability and 'humanity' as a broader, non-computational aspect of human experience, emphasizing compassion, character, and resilience. He believes AI can recognize and understand emotions but will not feel them. While AI can process context and generate outputs based on it, the subjective human experience—love, loss, fear—remains distinctly human. Huang notes that while intelligence might become a commodity, humanity's unique qualities are 'superhuman powers.' He encourages embracing AI as a tool to elevate professions, not replace them, citing how radiologists and software engineers have seen their roles expand rather than diminish with AI integration. The future of coding, he suggests, involves specifying intentions and goals for AI, potentially increasing the number of 'coders' and transforming all professions into more specialized, value-adding roles. He advises individuals to become experts in using AI to automate tasks and enhance their capabilities, viewing AI as a valuable life coach that removes the friction of learning new skills.

Common Questions

Why is extreme co-design essential?

Extreme co-design is essential because AI problems no longer fit within a single computer or GPU. To achieve speeds much faster than simply adding more computers, algorithms must be distributed across entire rack-scale systems, requiring optimization across GPUs, CPUs, memory, networking, storage, power, cooling, and software to overcome Amdahl's Law limitations and the slowing of Moore's Law.
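The Amdahl's Law bound invoked above can be sketched numerically. This is an illustrative snippet (the function name and parameter choices are ours, not from the podcast): it shows why the serial fraction of a workload, not the processor count, caps the speedup, which is the motivation Huang gives for optimizing the whole stack rather than just adding machines.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Overall speedup when a fraction p of the work is
    parallelized across n processors (Amdahl's Law)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelized, speedup is capped near
# 1 / (1 - 0.95) = 20x, no matter how many processors are added:
print(round(amdahl_speedup(0.95, 8), 2))        # → 5.93
print(round(amdahl_speedup(0.95, 100_000), 2))  # → 20.0
```

Shrinking the serial fraction p — by co-designing networking, memory, and software together — moves the ceiling itself, which is the point of NVIDIA's whole-stack approach.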

Topics

Mentioned in this video

Concepts
Moore's Law

The observation that the number of transistors in an integrated circuit doubles about every two years; noted as having largely slowed, necessitating extreme co-design for continued performance gains.

FP32

A standard for single-precision floating-point numbers; its inclusion in shaders was a significant step towards programmability and general computing for GPUs.

Mixture-of-Experts

An AI model architecture that uses multiple 'expert' networks; mentioned as an example of an AI innovation that requires anticipating hardware changes.

x86

A family of instruction set architectures developed by Intel; mentioned as a successful computing architecture despite criticism of its design, highlighting the importance of a large installed base.

conditional GANs

A type of generative adversarial network where the generation is conditional on some input; mentioned by Jensen Huang as an early development they worked on leading to diffusion models.

progressive GANs

A variant of Generative Adversarial Networks that progressively grows the network during training; mentioned by Jensen Huang as a development that led step by step to diffusion models.

Amdahl's Law

A principle in computer architecture that describes the maximum possible improvement in the speed of a system when only a part of the system is improved; mentioned in the context of distributed computing challenges.

Dennard Scaling

A scaling law that stated that as transistors get smaller, their power density stays constant so that the power consumption stays proportional to area; mentioned as having slowed, impacting computing capabilities.

Products
Gro rack

An additional rack system mentioned as part of the Vera Rubin architecture, designed to support AI agents.

Grace Blackwell

NVIDIA's rack architecture focused on processing large language models for inference.

Vera Rubin rack

NVIDIA's subsequent rack architecture, designed for AI agents, featuring storage accelerators and a new CPU, demonstrating rapid adaptation to evolving AI needs.

Ampere

NVIDIA's GPU microarchitecture that succeeded Turing; cited as an example of a past GPU generation when Jensen explains how his mental model has shifted from individual chips to entire AI factories.

HBM4

The fourth generation of High Bandwidth Memory; Jensen Huang recounts how its production volumes grew enormously following his discussions with industry CEOs.

CPU

Central Processing Unit, a component that NVIDIA now co-designs with GPUs, memory, and networking in its rack-scale systems.

DDR (Double Data Rate) memory

A type of synchronous dynamic random-access memory; mentioned as the number one DRAM in the world for CPUs in data centers, before HBM.

LPDDR5

A type of RAM for mobile devices, which NVIDIA encouraged suppliers to adapt for supercomputers in data centers.

High Bandwidth Memory

An advanced memory interface for 3D-stacked synchronous dynamic random-access memory; initially scarce, it was predicted by Jensen Huang to become mainstream for data centers.

Colossus (Supercomputer)

A supercomputer built by xAI in Memphis with 200,000 GPUs, rapidly constructed in four months; cited as an example of Elon Musk's efficient systems-engineering approach.

NVLink 72

NVIDIA's high-bandwidth, low-latency interconnect technology, designed for large-scale AI models like mixture of experts, allowing entire trillion-parameter models to operate as if on a single GPU.

Vera

A new CPU mentioned as part of the Vera Rubin rack, designed to support AI agents.

DGX-1

NVIDIA's first AI supercomputer, mentioned as an earlier system architecture that involved assembling parts in the data center, contrasting with the current rack-scale manufacturing where supercomputers are built in the supply chain.

GeForce

NVIDIA's line of graphics processing units, primarily for consumer gaming. Strategically used to deploy CUDA widely despite initial financial costs, essentially building NVIDIA's foundation for future computing.

iPhone

Apple's line of smartphones; used as a metaphor to describe the segmentation of valuable AI tokens (free, premium) and later to describe OpenClaw's impact as the 'iPhone of tokens'.

Software & Apps
CoWoS packaging

Chip-on-Wafer-on-Substrate, TSMC's advanced packaging technology; mentioned as a potential bottleneck in the AI supply chain's ability to scale.

Claude

An AI model, mentioned as one of the innovations that needed to reach a certain capability level before agentic systems like OpenClaw could fully emerge.

Cg

A high-level shading language developed by NVIDIA, built on FP32, which was a precursor to CUDA, enabling more general-purpose programming on GPUs.

Google Cloud

Google's suite of cloud computing services; mentioned as part of NVIDIA's broad ecosystem where its architecture is utilized.

OpenAI Codex

An AI model, mentioned as one of the innovations that needed to reach a certain capability level before agentic systems like OpenClaw could fully emerge.

Blender

A free and open-source 3D computer graphics software toolset; mentioned as a tool that users transition to using with NVIDIA GPUs and CUDA after starting with gaming.

OpenCL

An open standard for parallel programming of heterogeneous systems; mentioned as a competitor to CUDA in its early days.

Grok

An AI system; mentioned as a product that NVIDIA has been laying the groundwork for, in terms of hardware, for two and a half years before its announcement.

NeMo Claw

An NVIDIA solution to make OpenClaw installations super easy and secure.

ChatGPT

A generative AI system; mentioned as having done for generative systems what OpenClaw did for agentic systems in terms of broad impact.

NeMo-3

An open-weight 120 billion parameter AI model released by NVIDIA; highlighted for its innovative transformer and SSM architecture and NVIDIA's commitment to open-sourcing its models completely.

Azure

Microsoft Azure, Microsoft's cloud computing platform; mentioned as part of NVIDIA's broad ecosystem where its architecture is utilized.

CUDA

NVIDIA's parallel computing platform and programming model, which became the foundation for deep learning. Its strategic placement on GeForce GPUs was a critical, high-risk decision that consumed profits but built an essential installed base.

OpenShell

A security integration developed by NVIDIA and integrated into OpenClaw to secure agentic systems by restricting simultaneous access to sensitive information, code execution, and external communication.

EUV lithography

Extreme Ultraviolet Lithography, advanced technology crucial for manufacturing cutting-edge semiconductors; mentioned as a potential bottleneck in the AI supply chain.

RTX Remix

NVIDIA's modding tool that allows the community to inject the latest graphics technology, like ray tracing, into older games such as Skyrim.

DLSS 5

NVIDIA's AI-based image upscaling technology designed to enhance game graphics; discussed in the context of controversy over AI-generated 'slop' and its role as a tool for artists, not an override.

GPU

Graphics Processing Unit, originally NVIDIA's core focus, now part of a broader co-designed system including CPUs, memory, networking, and software.

Perplexity AI

An AI-powered answer engine; Jensen Huang expresses his admiration for the platform and its capabilities.

Companies
Nscale

A company involved in cloud infrastructure; mentioned as a new company joining NVIDIA's ecosystem.

SK Hynix

A South Korean semiconductor supplier; mentioned regarding its contributions to high-bandwidth memory (HBM) as a critical component for AI.

NVIDIA

One of the most important and influential companies in human history, powering the AI revolution. It has evolved from chip-scale to rack-scale design, focusing on extreme co-design and becoming a computing platform company.

Lilly

Eli Lilly and Company, a large pharmaceutical company; mentioned as an example of a company that NVIDIA wants to empower with the best biology AI systems for drug discovery.

Coreweave

A specialized cloud provider for large-scale GPU-accelerated workloads; mentioned as a new company joining NVIDIA's ecosystem.

Denny's

A full-service pancake house and restaurant chain; mentioned by Jensen Huang as where his first job was cleaning toilets, showing his humble beginnings.

OpenClaw

An open-source project for agentic AI systems that quickly captured public attention, likened to the 'iPhone of tokens,' enabling AI to use tools, access files, and do research.

TSMC

Taiwan Semiconductor Manufacturing Company, the world's largest dedicated independent semiconductor foundry; mentioned as a key bottleneck due to advanced packaging like CoWoS, but also lauded for its technology and culture of trust.

ASML

A Dutch company and the largest supplier in the world of photolithography systems for the semiconductor industry; mentioned as a key bottleneck in the AI supply chain due to its EUV lithography machines.

Mellanox Technologies

An Israeli-American multinational supplier of computer networking products, acquired by NVIDIA. Mentioned in the context of Jensen Huang shaping the belief system internally for strategic acquisitions.

Amazon

Amazon Web Services (AWS), Amazon's cloud computing platform; mentioned as part of NVIDIA's broad ecosystem where its architecture is utilized.
