Why would AI companies choose Baseten over major cloud providers like AWS or Google Cloud?

Companies choose Baseten for its specialized performance optimizations, its reliability across multiple clouds, and a developer platform that handles the complexity of standing up an inference stack, which cloud providers typically leave to the user.

What are the economic advantages of using post-trained open-source models?

Post-trained open-source models can be 70-90% cheaper to run than frontier models. This cost saving is crucial for companies aiming to achieve profitability and scale, especially as they grow their user base and workload volume.

What is a 'post-training workflow', and how does Baseten facilitate it?

A post-training workflow takes a base open-source model and fine-tunes it on specific data to optimize for a particular utility function or use case. Baseten provides the scaffolding and infrastructure to manage this process from data input to inference.

Why is compute scarcity a persistent issue in the AI industry?

Compute scarcity is driven by exponential growth in inference demand, which comes from more agentic applications and larger AI models. Unlike a bottleneck that has peak and off-peak periods, such as airport security, compute demand is continuous and compounding, and global usage leaves no downtime.

Will Baseten continue to rent cloud compute, or will it transition to owning its hardware?

Baseten is transitioning towards owning its compute infrastructure. Rising rental costs and the need for guaranteed access to enough hardware for future demand are pushing it to build and own its data centers, starting with a strong relationship with chip providers.

What are the biggest risks facing Baseten?

The core risks for Baseten are a potential lack of sufficient open-source models, the dominance of a few players controlling compute access, and not having enough compute to meet inference demand. Owning its infrastructure is key to mitigating the compute access risk.

What future business ventures is Tuhin excited about?

Tuhin is excited about investing in energy and power infrastructure to support the massive buildout of compute. He is also passionate about modular data centers, believing standardization of compute units could create an 'API for compute' and revolutionize the industry.

Stanford MS&E435 Economics of the AI Supercycle | Spring 2026 | Applications, Applied AI

Stanford Online

Education | 5 min read | 50 min video

Jun 5, 2026 | 23,501 views | 420 | 14

TL;DR

The next AI inference boom will be 1 billion times larger than today, but companies must build custom models and infrastructure or risk falling behind frontier labs.

Key Insights

Inference demand is projected to increase by a billion times, necessitating massive growth in compute infrastructure.

Currently, 95% of inference spend is on frontier models, but Baseten believes profitable, defensible companies will leverage custom-trained open-source models.

Open-source models are approximately 90 days behind closed-source frontier models but can be 70-90% cheaper to run.

Compute access is identified as the primary strategic advantage for inference, leading Baseten to shift from renting to potentially owning compute resources.

Baseten's inference service currently handles around 30 trillion tokens per day, exceeding the volume of OpenAI's API and Google's Gemini.

Compute costs are projected to nearly double in some cases, and Baseten expects to need the equivalent of $7 billion in compute spend within two years.

The coming billion-fold increase in AI inference

The AI landscape is on the cusp of an unprecedented surge in inference demand, projected to be a billion times greater than current levels. Tuhin Srivastava, founder and CEO of Baseten, highlights this exponential growth as a fundamental shift that will redefine the technology sector. This expansion underscores the critical role of inference, described as the 'cogs of AI value being delivered,' in powering the fastest-growing AI companies globally. The trajectory suggests a need for immense scaling in infrastructure and strategic shifts in how AI models are developed and deployed to capitalize on this supercycle.

The strategic advantage of custom-trained open-source models

While 95% of current inference spending goes to frontier models, Baseten's core thesis is that truly defensible and profitable businesses will be built on custom-trained, or post-trained, open-source models. Srivastava explains that although open-source models typically lag frontier models by about 90 days, they cost 70% to 90% less to run. This economic advantage becomes crucial for companies aiming for profitability and healthy gross margins, especially as they scale. Relying solely on frontier models also risks handing valuable user data and proprietary workflows to model providers, which can undermine a company's competitive edge. By owning their intelligence through custom models, companies can build a more sustainable and defensible business. As their user base and operational volume grow, this transition becomes less of a choice and more of an existential necessity.
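To make the 70% to 90% range concrete, the following is a minimal sketch of the cost arithmetic. The monthly bill is a hypothetical figure chosen for illustration, not a number from the talk, and the function name `open_source_cost` is likewise an assumption.

```python
def open_source_cost(frontier_cost: float, savings: float) -> float:
    """Cost of the same workload on a model that is `savings` cheaper to run.

    `savings` is a fraction, e.g. 0.70 for "70% cheaper".
    """
    if not 0.0 <= savings <= 1.0:
        raise ValueError("savings must be a fraction between 0 and 1")
    return frontier_cost * (1.0 - savings)


# Hypothetical monthly frontier-model bill (an assumption, not from the talk).
FRONTIER_MONTHLY_USD = 1_000_000.0

# The two ends of the savings range quoted in the talk.
for savings in (0.70, 0.90):
    cost = open_source_cost(FRONTIER_MONTHLY_USD, savings)
    print(f"{savings:.0%} cheaper -> ${cost:,.0f} per month")
```

Because the saving is a fixed fraction of the bill, the absolute gap widens as volume grows, which is why the argument above ties the switch to scale.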

Navigating the cloud and performance landscape

Companies like Baseten initially adopt a multi-cloud strategy, stitching together compute from various providers (Baseten currently operates across 18-20 clouds and 87 clusters) to ensure access and resilience. This approach abstracts away the complexity of sourcing GPUs, which are notoriously scarce. While major cloud providers offer their own inference platforms, Baseten believes its specialized software stack adds significant value in performance optimization, multi-cloud reliability, and developer tooling. Many clients first try raw cloud providers or AI clouds like CoreWeave or Nebius, but often find it painful to build their own inference stack on top of raw compute, which leads them to solutions like Baseten. Custom models optimized for specific use cases are expected to improve user experience, with lower latency and higher reliability, rather than degrade it.
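The idea of stitching clusters from several clouds into one pool can be sketched as a simple placement rule. This is an illustrative assumption, not a description of Baseten's scheduler: the `Cluster` type, the `pick_cluster` function, and the cluster and provider names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cluster:
    name: str        # hypothetical cluster identifier
    provider: str    # hypothetical cloud provider label
    healthy: bool    # result of some health check, assumed to exist
    free_gpus: int   # GPUs currently unallocated


def pick_cluster(clusters: list[Cluster], gpus_needed: int) -> Optional[Cluster]:
    """Return the healthy cluster with the most free GPUs that fits the request."""
    candidates = [c for c in clusters if c.healthy and c.free_gpus >= gpus_needed]
    if not candidates:
        return None  # no capacity anywhere in the pool
    return max(candidates, key=lambda c: c.free_gpus)


fleet = [
    Cluster("cluster-a", "cloud-1", healthy=True, free_gpus=8),
    Cluster("cluster-b", "cloud-2", healthy=False, free_gpus=64),
    Cluster("cluster-c", "cloud-3", healthy=True, free_gpus=16),
]

chosen = pick_cluster(fleet, gpus_needed=8)
print(chosen.name if chosen else "no capacity")
```

The unhealthy cluster is skipped even though it has the most free GPUs, which is the resilience benefit in miniature: an outage at one provider removes capacity from the pool rather than taking the service down.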

The escalating compute scarcity and the shift to ownership

Compute scarcity is not a temporary issue but a compounding problem. Demand for AI inference is surging due to increasingly agentic applications and larger models. This persistent demand, combined with extended lead times for acquiring GPUs (potentially 12-15 months out), is driving Baseten to re-evaluate its strategy. A recent example highlighted a dramatic price increase for B200 Blackwell chips, with renewal costs nearly doubling. This signifies that renting compute may soon become infeasible for large-scale operations. Baseten's own inference service, processing approximately 30 trillion tokens daily, is projected to require the equivalent of 150,000 B200s in two years, translating to a staggering $7 billion in compute expenditure. To secure this necessary capacity and mitigate risks, Baseten is moving towards potentially owning compute infrastructure, a move that is also projected to be about 30% cheaper than renting at scale.
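The figures above imply a per-GPU spend that can be worked out directly. The sketch below does that arithmetic; the talk does not say over what period the $7 billion is spent, so the period is left as a parameter, and applying the 30% ownership saving to the whole figure is a simplifying assumption.

```python
GPUS_NEEDED = 150_000        # B200 equivalents, as quoted in the talk
COMPUTE_SPEND_USD = 7e9      # total compute expenditure, as quoted in the talk
OWNERSHIP_SAVINGS = 0.30     # "about 30% cheaper than renting", as quoted
HOURS_PER_YEAR = 24 * 365

# Spend attributable to each GPU over the (unspecified) spending period.
spend_per_gpu = COMPUTE_SPEND_USD / GPUS_NEEDED


def implied_hourly_rate(spend_per_gpu_usd: float, years: float) -> float:
    """Hourly rate per GPU if the spend covers `years` of continuous use."""
    return spend_per_gpu_usd / (years * HOURS_PER_YEAR)


# Assumption: the 30% saving applies uniformly to the full spend.
owned_equivalent = COMPUTE_SPEND_USD * (1.0 - OWNERSHIP_SAVINGS)

print(f"Spend per GPU: ${spend_per_gpu:,.0f}")
for years in (1, 2):  # the spending period is an assumption
    rate = implied_hourly_rate(spend_per_gpu, years)
    print(f"Implied rate over {years} year(s): ${rate:.2f} per GPU-hour")
print(f"Owned-infrastructure equivalent: ${owned_equivalent / 1e9:.1f}B")
```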

Hardware diversity: NVIDIA's ecosystem dominance

While Baseten acknowledges the promise of diverse hardware architectures like TPUs and newer 'neo' chips, NVIDIA's ecosystem remains dominant. The vast majority of Baseten's fleet runs on NVIDIA chips, largely due to the mature CUDA developer ecosystem, extensive supply chain, and strong relationships with manufacturers like TSMC. New architectures often struggle to compete with the 'all-in-one' advantage of NVIDIA's integrated hardware and software stack, particularly because inference frameworks such as TensorRT-LLM, vLLM, and SGLang rely on CUDA. While heterogeneous architectures are likely the future, NVIDIA's current grip on the market, fueled by its established infrastructure and developer community, makes it the pragmatic choice for companies prioritizing speed and agility in the current AI race.

The future of open-source models and national security

A critical bet for Baseten is the continued viability and quality of open-source models. Currently, the leading open-source models originate from China, prompting concerns about America's position in AI development. This situation is framed not just as a competitive disadvantage but also as a potential national security issue. The lower cost (70-90% cheaper) and higher speed of open-source models compared to frontier models drive their adoption, especially among companies focused on profitability and defensibility. While companies like Meta have shifted away from open-sourcing some of their latest models, there is a recognition that robust open-source development in the U.S. is essential. Investments from companies like Google (with Gemma) and NVIDIA, alongside potential government involvement, suggest a growing effort to bolster domestic open-source AI capabilities, which Srivastava sees as inevitable given the national interests at stake.

Building a modular future for data centers

Looking beyond Baseten, Srivastava suggests that the next frontier in AI infrastructure lies in standardizing compute units through modular data centers. Drawing an analogy to shipping containers, which revolutionized global trade by normalizing the unit of cargo, modular data centers aim to create a standardized 'unit of compute.' This approach could significantly accelerate data center construction and operation by simplifying design, deployment, and maintenance. The current process for building data centers is highly customized and inefficient. A modular, consistent format could give rise to an 'API for compute' and an industry that scales rapidly and efficiently to meet the immense demand for processing power in the AI era. He pairs this with investment in energy and power infrastructure to support the build-out.

Open Source vs. Frontier Models: Key Differences

Data extracted from this episode

Feature	Frontier Models	Open Source Models
Performance	State-of-the-art	Approx. 90 days behind
Cost	Higher	70-90% cheaper
Defensibility	Potentially lower (risk of data extraction)	Higher (owning own intelligence)
Development Origin	Large AI labs	Primarily China, some US investment

Baseten Compute Offering

Data extracted from this episode

Offering	Hardware	Pricing Example (per hour)
Rented Compute	NVIDIA B200s (Blackwell)	$263 (current rate)
Rented Compute Renewal (projected)	NVIDIA B200s (Blackwell)	$510 (projected rate)

Common Questions

Baseten provides production inference infrastructure for AI companies. Its core offering helps these companies run highly optimized, custom AI models efficiently, focusing on performance, reliability, and a strong developer platform.

Topics

AI & Machine Learning, Technology & Innovation, Business & Entrepreneurship, Data Centers, Open-source AI, Cloud Computing, AI Infrastructure, AI Business Models, LLM Inference, Custom AI Models, Compute Hardware

Mentioned in this video

Companies

NVIDIA

A prominent technology company specializing in GPUs and AI hardware. Mentioned as a dominant player with a strong ecosystem (CUDA) and as a significant investor in open-source initiatives.

Reflection AI

An AI company that reportedly aims to release good open-source models, contributing to the growing availability and advancement of open-source AI technology.

Crusoe

A company previously featured in the class, run by Chase, which discussed the economics of building and owning data centers, representing a different approach to compute infrastructure.

Nebius

An AI cloud provider mentioned as a competitor in the market for inference infrastructure, alongside others like CoreWeave.

Google

A technology giant mentioned for producing the open-source model 'Gemma' and investing in the AI ecosystem, supporting the development of open-source AI.

CoreWeave

An AI cloud provider mentioned as part of the competitive landscape founders might consider before choosing Baseten for their inference needs.

Alibaba

A Chinese multinational technology company, mentioned as a source of leading open-source AI models, contrasting with American contributions in this area.

Baseten

Tuhin's company, focused on providing production inference infrastructure for AI companies, enabling them to run custom models efficiently and cost-effectively.
