Key Moments
Four CEOs on the Future of AI: CoreWeave, Perplexity, Mistral, and IREN
AI is accelerating the need for specialized compute infrastructure. Companies like CoreWeave and IREN are building massive data centers by leveraging creative financing and abundant renewable energy, while Perplexity is redefining how users interact with AI and Mistral focuses on open-source, specialized models.
Key Insights
CoreWeave's average client contract is 5 years, directly contradicting claims of GPU obsolescence within 16-18 months.
Perplexity aims to be the most accurate AI, providing AI access to the internet, a browser, and eventually full computer access for tasks.
Mistral AI trains its next generation of frontier models with NVIDIA, emphasizing open-source models that can be specialized for enterprise clients.
IREN has secured 4.5 gigawatts of power in West Texas, enough to power the entire Bay Area, and uses 100% renewable energy for its data centers.
The cost of a million tokens has dramatically decreased, from $32 for GPT-3 to $0.09 today, fueled by capitalism and competition.
Perplexity operates with positive gross margins on all revenue, as most of its income is from recurring subscriptions and efficient token usage.
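The token-cost decline cited above can be sanity-checked with simple arithmetic. This is a minimal sketch using only the two figures quoted in the episode ($32 and $0.09 per million tokens):

```python
# Cost per million tokens: GPT-3 era vs. the figure quoted today.
gpt3_cost = 32.00   # dollars per million tokens
today_cost = 0.09   # dollars per million tokens

reduction_factor = gpt3_cost / today_cost
print(f"~{reduction_factor:.0f}x cheaper")  # roughly a 356x reduction
```

That is more than two orders of magnitude in about two years, which is the scale of deflation the speakers credit to competition among model providers.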
CoreWeave's evolution from crypto to AI compute infrastructure
CoreWeave's journey began in 2017, initially mining crypto to leverage downtime from algorithmic hedge fund operations. Through several crypto winters, their risk management expertise allowed them to weather volatility. They soon diversified into CGI rendering and then batch computing for medical research. By 2020-2021, CoreWeave shifted focus to neural networks, donating GPUs to an open-source AI project to learn the principles of large-scale parallelized computing. This 'computational tuition' informed their strategy, leading to the development of a cloud purpose-built for AI. They focus on the infrastructure layer, living 'above NVIDIA GPUs but below the models,' integrating software and operations for this specific use case. Their client base expanded from early adopters like Inflection to hyperscalers and companies like OpenAI.
Debunking GPU obsolescence and the value of long-term contracts
Michael Intrator of CoreWeave dismisses claims of rapid GPU obsolescence as 'nonsense,' often pushed by short-sellers. He emphasizes that CoreWeave's clients, many with large balance sheets, sign contracts averaging five years. This demonstrates the continued commercial viability and demand for GPUs beyond short-term speculative cycles. Intrator likens the situation to used iPhones finding new life in developing markets, suggesting that older GPUs, while not bleeding edge, retain significant value for inference tasks, rendering, and new use cases. CoreWeave uses a six-year depreciation schedule, believing GPUs will last longer, and notes that the appreciating prices of older GPUs like A100s confirm their enduring utility.
The 'box' financing model: Securing hyperscale infrastructure through structured debt
CoreWeave has innovated in financing large-scale GPU infrastructure through a 'box' financing model. When a client like Microsoft commits to buying compute, CoreWeave creates a 'box' that encompasses the client contract, GPUs, and data center lease. This box governs cash flow, prioritizing payments for data centers, power, and then debt servicing (interest and principal). Any remaining funds return to CoreWeave. This vehicle provides confidence to sophisticated lenders, ensuring repayment. Through this model, CoreWeave raised $35 billion in 18 months, enabling rapid infrastructure build-out. A key economic benefit is that within 2.5 years of a 5-year deal, the capital expenditure is paid off, significantly reducing their cost of capital. This allows them to compete with hyperscalers over time and adapt to market demands by selectively taking on deals that fit their risk management profile.
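The priority-of-payments structure described above can be sketched as a simple cash-flow waterfall. The numbers below are illustrative assumptions, not figures from the episode; only the ordering (data center, power, debt service, then residual to CoreWeave) and the roughly 2.5-year payback on a 5-year deal come from the discussion:

```python
# Hypothetical sketch of the "box" cash-flow waterfall. All dollar
# amounts are invented for illustration; the payment ordering and the
# ~2.5-year capex payoff on a 5-year deal are from the episode.

def waterfall(revenue, datacenter_cost, power_cost, debt_service):
    """Distribute one period's contract revenue in priority order:
    data center lease, then power, then debt service (interest and
    principal); whatever remains flows back to CoreWeave as equity."""
    remaining = revenue
    paid = {}
    for name, owed in [("datacenter", datacenter_cost),
                       ("power", power_cost),
                       ("debt_service", debt_service)]:
        pay = min(remaining, owed)  # each claim is paid before the next
        paid[name] = pay
        remaining -= pay
    paid["equity"] = remaining
    return paid

# Illustrative 5-year deal: debt service is sized so the assumed capex
# is repaid in roughly the first 2.5 years.
capex = 1_000.0                 # assumed capex financed inside the box
annual_revenue = 500.0          # assumed annual contract revenue
annual_dc, annual_power = 40.0, 30.0
annual_debt = capex / 2.5       # 400.0 per year for 2.5 years

year = waterfall(annual_revenue, annual_dc, annual_power, annual_debt)
print(year)  # equity residual: 500 - 40 - 30 - 400 = 30
```

After year 2.5 in this sketch the debt-service line drops to zero and the full residual flows to equity, which is the mechanism behind the claim that the structure lowers CoreWeave's cost of capital for the back half of the contract.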
Perplexity's pursuit of accuracy and the evolution to AI as the operating system
Aravind Srinivas of Perplexity frames the company's mission around being the most accurate AI. That focus on accuracy led them to give AI access to the internet, then to a full browser (the Comet browser), and now to the computer itself through 'Perplexity Computer.' The product orchestrates various AI models (GPT, Claude, Gemini, etc.) as 'musicians' playing different 'instruments.' Perplexity is also developing 'Perplexity Personal Computer,' a hybrid model that synchronizes with local hardware like a Mac Mini for private data handling while offloading complex tasks to the cloud. They envision AI becoming the operating system, shifting from programmatic execution to objective-based tasks and abstracting complexities like managing API keys or juggling model subscriptions. This 'Steve Jobs-like' end-to-end integration aims to lower the barrier for users to create bespoke software and businesses.
Mistral AI's open-source strategy and verticalized models
Arthur Mensch of Mistral AI emphasizes training the next generation of frontier models with NVIDIA, building on their previous collaboration. Their core strategy is to produce best-in-class open-source models, which can then be specialized for enterprise clients through products like 'Forge.' Mistral operates globally, with significant business and researchers in the US, France, UK, and Singapore. They focus on helping European companies adopt AI and specialize models for sectors like financial services, engineering, and physics, leveraging proprietary data. Mistral believes that open models allow deeper customization and better integration with a company's intellectual property compared to closed-source models. They also work on the orchestration layer, building bespoke business applications by modifying models and their surrounding 'harness.'
Data segregation and portable platforms for vertical AI training
Mistral addresses the critical need for data segregation in verticalized AI training. They deploy their technology—training tools, data processing services—directly onto customer infrastructure. This ensures that sensitive data never leaves the client's environment, alleviating security concerns for CIOs. Mistral sends engineers and scientists to work with subject matter experts within client organizations (e.g., ISML for image scanning) to identify training data needs and refine models. This portable platform approach, combined with expertise transfer, allows them to serve critical use cases in industries handling sensitive data, ensuring privacy and control.
IREN's infrastructure build-out: Power, scale, and renewable energy
Daniel Roberts of IREN highlights their proactive approach to building large-scale data centers, starting eight years ago with a focus on high-performance computing. Initially bootstrapping with Bitcoin mining, they've now shifted to AI chips. IREN secured 4.5 gigawatts of power capacity in West Texas, using 100% renewable energy sources like hydro (British Columbia) and wind/solar (West Texas). This strategic location near excess renewable energy allows them to monetize low-cost power into digital commodities. Their flagship Texas site has a capacity of 750 megawatts, which was unprecedented when planned. They have a significant partnership with Microsoft, including a $9.7 billion contract, though this represents only 5% of their total capacity, indicating massive ongoing demand.
Bridging the digital-physical divide: Energy, labor, and supply chain challenges
IREN's data center expansion faces challenges primarily related to 'time to compute' rather than power availability, due to their foresight in securing land and grid connections. This involves mobilizing thousands of tradespeople for construction in remote areas like West Texas, straining local resources and supply chains for materials like memory. Roberts describes it as 'permanent whack-a-mole' to bring compute online. He notes the high demand for skilled labor, with electricians and construction workers commanding significantly increased salaries. IREN focuses on hiring locally, supporting communities, and retraining workforces, often leveraging existing industrial infrastructure in areas where manufacturing has declined. The industry also faces a shortage of skilled tradespeople, necessitating partnerships with trade schools and universities.
Mentioned in This Episode
Common Questions
How did CoreWeave get started?
CoreWeave started in 2017 focusing on crypto mining with GPUs. It then diversified into CGI rendering and batch computing for medical research, and eventually moved into neural networks by donating A100 GPUs to EleutherAI to learn about large-scale parallelized computing, which launched its AI infrastructure business.
Sponsor of the episode, described as a modern marketplace and massive platform for capital raising and long-term impact.
An open-source project that CoreWeave supported by donating A100 GPUs, which helped them understand the requirements for large-scale parallelized computing.
Another large company driving demand for compute, although the host notes they haven't 'figured out the consumer case yet' for AI.
Analogy used to describe the overwhelming demand for compute, comparing it to the high demand for tickets during the Patrick Ewing era.
A news organization, whose paid subscription should ideally allow AI agents to perform searches for users, making the service stickier.
A company previously involved in data training, specifically mentioned in the context of Facebook and customer data concerns.
An AI infrastructure company building large-scale compute for hyperscalers, specializing in purpose-built cloud for AI.
CoreWeave's first large commercial language model partner, showcasing their early involvement with foundational AI models.
A major foundation-model client that CoreWeave expanded to serve, highlighting its role in scaling compute for leading AI developers.
Exception to the 6-year depreciation standard, indicating they might have a different approach to hardware lifespan.
A key client of CoreWeave, which uses their compute infrastructure for large-scale AI operations.
A major player in AI, noted for its massive cash flow and demand for compute, also praised for buying up fiber during bust cycles.
Described as 'MIA' (missing in action) from the AI compute demand story, possibly indicating a different strategic approach or less public activity in this area.
Cited as an example of a company built on declining storage and bandwidth costs creating new opportunities for content creation and sharing.
A company that lost customers due to concerns about sending data to Meta, relevant to data segregation and privacy.
An open-source model available on Perplexity's service, showcasing their support for a diverse range of AI models.
A leading AI company based in France, focused on training open-source frontier models and customizing them for enterprise customers through products like Forge and Studio.
An AI agent framework, recognized for its explosion in popularity among hackers and its potential for enterprise automation, but criticized for lacking enterprise-grade governance and observability.
A publicly traded company that started in Bitcoin mining and is now swapping out its Bitcoin infrastructure for AI chips, building large-scale data centers with a focus on renewable energy.
A business and employment-oriented social media service, also cited for its restrictive API access for AI bots.
A social news aggregation, content rating, and discussion website, mentioned for its restrictive API access for AI agents.
Announced a joint workstation with NVIDIA, featuring high RAM, suggesting a return to powerful desktop workstations for AI.
Referenced for its early consumer business trajectory, suggesting Perplexity's similar growth in user base.
Elon Musk's aerospace company, mentioned in the context of his vision for data centers in space.
A heavy manufacturing company, mentioned as a client that benefits from Mistral AI's specialized models for tasks like image scanning and defect detection.
A financial institution mentioned as a customer for whom Mistral AI provides solutions for critical processes like KYC (Know Your Customer) with deterministic and observable agents.
An investment company that is doing well in the data training space, mentioned as an example of companies specializing in refining AI models.
CEO of CoreWeave, who started the company in 2017 after running an algorithmic hedge fund, initially focusing on crypto mining with GPUs.
Co-founder of Inflection AI, who CoreWeave worked with for their first large commercial language model partnership.
A quant investor known for bearish predictions, who expressed concerns about the AI industry that Michael Intrator dismisses as 'nonsense'.
CEO of NVIDIA, referred to as 'Jensen', frequently discusses the accelerated compute advancements and the future of AI infrastructure.
Co-founder of YouTube, whose realization about declining storage and bandwidth costs enabled the platform's free content model.
Former CEO of YouTube (the host stumbles over her name, but context implies Susan Wojcicki, and adds 'rest in peace'), who shared statistics about YouTube's massive upload volume.
CFO of OpenAI, who shared insights on the dramatic reduction in token costs from GPT-3 to current models.
CEO and co-founder of Perplexity AI, interviewed about the evolution and strategy of his company.
CEO of OpenAI, mentioned for his significant fundraising efforts, highlighting the competitive landscape for smaller AI companies like Perplexity.
AI researcher whose work on recursive use of large language models (the host's 'recursive thing') inspired laypeople to experiment with recursive AI.
Technologist and investor, who tweeted about Linux computers being the right idea in the context of Perplexity's release.
CEO of Reddit, addressed directly by the host with proposals for API access for AI agents with paid accounts.
CEO of SpaceX and X (formerly Twitter), mentioned for his ambitious vision of putting data centers in space, and merging AI with his other companies.
CEO of Anthropic, who noted the trend of AI models specializing, a key aspect of Perplexity's multi-model orchestration strategy.
CEO of Mistral AI, interviewed about their work on open-source frontier models and enterprise specialization.
CEO of Liberty Energy, mentioned in connection with the Trump administration's energy policies.
Former US President, whose administration's shifting stance on energy sources (from 'clean, beautiful coal' to 'all sources matter') is mentioned regarding nuclear power.
CEO of Microsoft, addressed directly by the host with a plea to allow LinkedIn (owned by Microsoft) to work with AI platforms like Perplexity.
Co-CEO and co-founder of IREN, interviewed about their transition from Bitcoin mining to AI data centers.
An early use case CoreWeave moved into after crypto, building projects for animation and image rendering.
A computing method CoreWeave utilized for medical research, applying GPUs to drive scientific advancements.
The observation that the number of transistors in an integrated circuit doubles approximately every two years; discussed in the context of how accelerated computing is surpassing its predictions.
Economic concept explaining that increased efficiency in resource use leads to increased overall resource consumption, applied to AI compute driving more demand.
The initial use case for IREN's large-scale data centers, which was later leveraged to bootstrap the platform for higher-value use cases like AI.
NVIDIA's GPU that CoreWeave purchased and donated to EleutherAI to learn how to use GPUs for neural networks.
NVIDIA's advanced GPU that CoreWeave was among the first to bring to commercial production at scale.
NVIDIA's next-generation GPU, which CoreWeave was also among the first to deploy at scale for commercial use.
NVIDIA's Grace Blackwell Superchip, CoreWeave being the first to bring it to scale, demonstrating their commitment to bleeding-edge technology.
NVIDIA's Grace Blackwell Superchip, mentioned as the latest architecture that CoreWeave is deploying at scale.
An older iPhone model used as an analogy to explain that older GPUs still retain value and find new use cases, particularly in developing markets.
Google's smartphone, also used as an analogy to illustrate the longer useful life of technology, similar to older GPUs still finding markets.
A family of computer networking technologies commonly used in local area networks (LANs) and data centers, discussed for its role in data moving within and between data centers.
A high-performance computer network communication link used in data centers, critical for low-latency communication between GPUs.
Apple's compact desktop computer, suggested as a local server for Perplexity Personal Computer to orchestrate private data locally.
Apple's high-performance desktop computer, used by the host to run local AI models like Kimi 2.5.
A pivotal moment in AI, occurring after CoreWeave had already started scaling compute for larger clients, demonstrating widespread AI application.
A cloud service provider mentioned for its effectiveness in general-purpose web servers, contrasting with CoreWeave's specialized AI focus.
An early version of OpenAI's language model, noted for its initial high cost per million tokens, which has since drastically reduced.
Elon Musk's AI model from xAI, noted for its strong performance among competitors.
Operating system that Perplexity's server-side computer runs on, predicted to become the 'eventual winner' for AI workstations due to its stability and customizability.
An AI model (Alibaba's Qwen, which the host pronounces 'Quen') used by Perplexity under the hood, showing their diverse model-integration strategy.
A workspace software, also given root access to the host's AI agent for data access.
Google's mobile operating system; the host expresses a dream for a Perplexity app that could take root access to an Android phone.
Perplexity's core product, designed to provide accurate answers by giving AI access to the internet.
A feature in Perplexity that runs the same query across multiple AIs, shows where they agree/disagree, and identifies nuances.
A product by Mistral AI that allows deploying agents for end-to-end automation, often used by European companies to adopt advanced technology.
A time-management and scheduling calendar service, given root access to the host's AI agent for scheduling.
A browser product from Perplexity that allows AI to fully control a browser for task execution, making it accurate for web-based actions.
A concept for synchronizing Perplexity Computer's server-side execution with local hardware (like a Mac Mini) for privacy and local data handling.
A video conferencing software, given root access to the host's AI agent to integrate communications.
Perplexity's latest product, which gives AI full access to a computer, enabling it to perform tasks typically done by a human on a computer, acting as an orchestrator of various AI models.
Anthropic's coding model, specifically favored by backend engineers, highlighting the specialization of models within coding.
A family of large language models, including those by OpenAI, which Perplexity can orchestrate to provide optimal results.
An OpenAI coding model, favored by iOS engineers, further illustrating the specialization of AI models.
Apple's mobile operating system, mentioned in the context of Perplexity Comet's iOS app and the challenges of mobile device integration.
A language model by Anthropic, which Perplexity uses as part of its multi-model orchestration, also noted for its 'co-work' feature for repetitive tasks.
Perplexity Computer's integration with Slack, allowing users to interact with AI for back-office automation within their workspace.
A cloud-based word processor, part of the G Suite, that the host gave root access to their AI agent.
Google's powerful AI model, mentioned as a strong competitor in the AI landscape.
A local AI model running on the host's Mac Studio, noted for providing about 80% of the quality of larger models for free.
Google's suite of cloud computing, productivity and collaboration tools given root access to an AI agent.
An AI model available on Perplexity's service, demonstrating their commitment to multi-model orchestration.
A product by Mistral AI that helps customize their models for enterprise customers in specific domains like engineering, physics, and financial services.
Google's email service, used to summarize email conversations for an investment company using an enterprise AI agent.
Meta's open-source large language model, which Perplexity can also incorporate into its multi-model orchestration.
A suite of cloud computing, productivity and collaboration tools, which Perplexity's Computer can integrate with as a tool for automation.
More from All-In Podcast