Aravind Srinivas: Perplexity CEO on Future of AI, Search & the Internet | Lex Fridman Podcast #434

Lex Fridman
Science & Technology · 7 min read · 183 min video
Jun 19, 2024 · 930,310 views

TL;DR

Perplexity CEO discusses AI's future, reimagining search with citations, and the pursuit of knowledge.

Key Insights

1. Perplexity operates as an 'answer engine,' combining search and large language models (LLMs) to provide direct, cited answers, minimizing hallucinations by grounding responses in human-created web sources.

2. The company's core philosophy, inspired by academic rigor, is that every statement should be backed by a source, a principle applied to AI to enhance accuracy and trustworthiness.

3. Perplexity aims to be a 'knowledge discovery engine,' guiding users through a curiosity-driven journey with related questions, rather than just providing a list of links.

4. Unlike Google's ad-driven model, Perplexity focuses on subscription revenue, allowing it to prioritize user experience and truthful answers over maximizing ad clicks.

5. The evolution of LLMs, from attention mechanisms and Transformers to scaling laws and post-training techniques like RLHF, has been crucial for their current capabilities.

6. Future AI breakthroughs may involve decoupling reasoning from facts, enabling smaller models to reason effectively, and leveraging iterative compute for self-improvement and new knowledge creation.

PERPLEXITY'S ANSWER ENGINE APPROACH

Perplexity redefines web search as an 'answer engine,' distinct from traditional search engines like Google. It integrates search capabilities with large language models (LLMs) to deliver direct, concise answers, all while meticulously citing original human-created sources. This academic-inspired approach aims to significantly reduce LLM hallucinations, making the information more reliable. By instructing the LLM to only use information found in retrieved, relevant paragraphs and to provide footnotes for every statement, Perplexity ensures factual grounding and transparency, mirroring the rigorous citation standards of academic writing.
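The grounding instruction described above can be sketched as simple prompt construction. This is a hypothetical illustration of the idea, not Perplexity's actual prompt; `build_grounded_prompt` and the passage format are invented for the example:

```python
def build_grounded_prompt(question, passages):
    """Build a prompt that restricts the model to retrieved passages.

    `passages` is a list of (source_url, text) pairs. This sketches only
    the grounding idea; the real prompt and format are not public.
    """
    numbered = "\n".join(
        f"[{i}] ({url}) {text}"
        for i, (url, text) in enumerate(passages, start=1)
    )
    return (
        "Answer the question using ONLY the numbered passages below. "
        "After every statement, add a footnote [n] naming the passage "
        "it came from. If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "When was the Transformer introduced?",
    [("https://example.com/attention", "The Transformer was introduced in 2017.")],
)
```

Pairing every statement with a footnote makes hallucinations auditable: any sentence without a matching passage can be flagged and removed.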

FROM ACADEMIC ROOTS TO PRODUCT INNOVATION

The inception of Perplexity stemmed from a practical need to overcome the limitations of traditional search for complex, nuanced queries, particularly concerning factual accuracy. Co-founders Aravind Srinivas and Denis Yarats, both with academic backgrounds, observed that AI chatbots often provided incorrect information without substantiation. Inspired by the principle that every academic statement requires a citation, they applied this to AI, creating a system that forces the LLM to ground its responses in verifiable sources. This model addresses the challenge of AI accuracy by design, moving beyond mere chatbots to a more trustworthy knowledge discovery platform.

RETHINKING SEARCH: KNOWLEDGE DISCOVERY

Perplexity positions itself as a 'knowledge discovery engine' where the user's journey begins, not ends, with an answer. It emphasizes continuous exploration by suggesting related questions, fostering a deeper understanding of topics. This design philosophy, "where knowledge begins," encourages users to delve further into a subject, expanding their understanding rather than just satisfying an immediate query. It differentiates itself from Google by offering a curated, Wikipedia-like experience with direct answers, acknowledging that while Google excels at quick navigational searches, Perplexity focuses on facilitating insightful learning.

BUSINESS MODEL AND COMPETITIVE STRATEGY

Perplexity's business model diverges significantly from Google's ad-centric approach. Recognizing the high margins of Google's advertising system, Perplexity avoids direct competition in that space. Instead, it prioritizes a subscription model, which allows it to maintain a focus on user experience and the unbiased pursuit of truth, rather than generating ad revenue. This strategy reflects a broader trend among startups to identify and capitalize on areas where incumbent giants, constrained by existing high-margin businesses, are reluctant to innovate, akin to Amazon's entry into cloud computing with AWS.

USER EXPERIENCE AND LATENCY OBSESSION

Inspired by Larry Page's obsession with latency and user-centric design, Perplexity meticulously optimizes its user experience. The company focuses on minimizing response times ('time to first token') and streamlining interactions, even down to the speed at which a keyboard appears on a mobile device. This commitment extends to features like suggesting follow-up questions and auto-scrolling to answers, anticipating user needs. The philosophy that 'the user is never wrong' guides product development, ensuring that even poorly phrased queries yield high-quality, relevant answers, allowing users to be 'more lazy' while exploring knowledge.
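The 'time to first token' metric mentioned above is straightforward to instrument: measure the delay between issuing a request and the first streamed token arriving. A minimal sketch, where `fake_stream` is a stand-in for a real model's streaming API:

```python
import time

def time_to_first_token(token_stream):
    """Return the first token and the seconds elapsed until it arrived.

    `token_stream` is any iterator yielding tokens; in production it
    would wrap a model's streaming response (hypothetical here).
    """
    start = time.perf_counter()
    first = next(token_stream)  # blocks until the model emits a token
    return first, time.perf_counter() - start

def fake_stream():
    # Stand-in for a model: the first token arrives after a short delay.
    time.sleep(0.05)
    yield from ["Perplexity", " is", " an", " answer", " engine."]

token, ttft = time_to_first_token(fake_stream())
```

Because streaming lets the rest of the answer render progressively, time to first token, rather than total generation time, dominates how fast the product feels.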

INSPIRATIONAL LEADERSHIP AND FOUNDATIONAL THINKING

Aravind Srinivas draws inspiration from various tech leaders, adopting an 'ensemble algorithm' approach to leadership. From Jeff Bezos, he learns the importance of clarity of thought and customer obsession, noting Bezos's 'your margin is my opportunity' philosophy. Elon Musk inspires with his 'raw grit' and first-principles thinking, evident in his hands-on approach to problem-solving. Jensen Huang's constant drive for improvement and questioning conventional wisdom resonates, particularly in hardware innovation. These influences collectively shape Perplexity's aggressive pursuit of technological excellence and user value.

THE EVOLUTION OF LARGE LANGUAGE MODELS (LLMS)

The recent explosion in AI capabilities, particularly in LLMs, stems from several key breakthroughs. The attention mechanism, and the Transformer architecture built on it in 2017, allowed for efficient parallel computation during training. Combined with insights from scaling laws—training larger models on vast, high-quality datasets—this led to models like GPT-2 and GPT-3. Post-training techniques such as Reinforcement Learning from Human Feedback (RLHF) were crucial for making these models controllable and useful. This progression established that extensive data, optimized architectures, and continuous refinement are pivotal to LLM advancement.

RETRIEVAL AUGMENTED GENERATION (RAG) AND HALLUCINATION MITIGATION

Perplexity’s core technical architecture relies heavily on Retrieval Augmented Generation (RAG). This framework ensures that the LLM first retrieves relevant documents and paragraphs and then generates an answer, strictly adhering to the information found. This 'don't say anything you don't retrieve' principle is designed to minimize hallucinations by grounding AI responses in verifiable, human-created web content. Hallucinations can still occur due to model skill limitations, poor or stale indexed information, or an overwhelming amount of detail. Continuous improvement in retrieval, indexing, and model training is essential to further enhance accuracy.
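The retrieve-then-generate loop, including a refusal path for the 'don't say anything you don't retrieve' principle, can be sketched as follows. `keyword_retrieve` is a deliberately naive stand-in for a real ranking stack, and `generate` stands in for an LLM call; none of this is Perplexity's actual implementation:

```python
def keyword_retrieve(question, corpus, k):
    # Toy retrieval: rank paragraphs by word overlap with the question.
    q = set(question.lower().split())
    hits = [(len(q & set(p.lower().split())), p) for p in corpus]
    hits = [(score, p) for score, p in hits if score > 0]
    hits.sort(key=lambda pair: -pair[0])
    return [p for _, p in hits[:k]]

def answer_with_rag(question, corpus, generate, k=3):
    """Retrieve first, then generate strictly from what was retrieved,
    refusing rather than guessing when nothing relevant is found."""
    paragraphs = keyword_retrieve(question, corpus, k)
    if not paragraphs:
        return "I could not find sources to answer that."
    context = "\n".join(f"- {p}" for p in paragraphs)
    return generate(f"Using only these sources:\n{context}\nAnswer: {question}")

corpus = [
    "Perplexity cites its sources in every answer.",
    "Bananas are rich in potassium.",
]
# A trivial `generate` that echoes its prompt, for demonstration.
reply = answer_with_rag("How does Perplexity cite sources?", corpus, lambda p: p)
```

The refusal branch is the important design choice: an empty retrieval result becomes an honest "no answer" instead of an invitation for the model to confabulate.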

WEB CRAWLING AND INDEXING CHALLENGES

Building Perplexity's index involves sophisticated web crawling mechanisms, akin to Google's. The 'Perplexity bot' navigates the web, making decisions on which URLs to crawl, their frequency, and handling modern web complexities like JavaScript rendering. It respects `robots.txt` files and politeness policies. Once content is fetched, it undergoes post-processing to create an ingestible format for ranking. This involves combining traditional information retrieval techniques like BM25 with more modern machine learning methods, acknowledging that no single approach, such as pure vector embeddings, fully captures the complexity of web pages and user intent. Domain knowledge is crucial for effective search.
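BM25, mentioned above as one of the traditional ranking signals, weights a query term by its rarity across the corpus (inverse document frequency) and its saturating frequency within a document, normalized by document length. A from-scratch sketch; production systems use tuned, inverted-index implementations:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against a query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter()  # number of documents containing each term
    for toks in tokenized:
        df.update(set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get higher weight; frequent terms saturate via k1,
            # and long documents are penalized via b.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            )
        scores.append(score)
    return scores

scores = bm25_scores("cat", ["the cat sat", "dogs bark loudly",
                             "the cat and the dog"])
```

Note how the shorter matching document outranks the longer one for the same term count, which is exactly the length normalization the `b` parameter controls.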

CONTEXT WINDOWS AND REASONING BREAKTHROUGHS

The increasing length of LLM context windows, extending to millions of tokens, offers new possibilities for ingesting more detailed information. However, this also presents a trade-off with instruction-following capabilities, as increased data can sometimes confuse the model. Aravind believes that true breakthroughs lie in decoupling reasoning from facts, allowing models to learn effectively even with limited information, similar to an 'open book exam.' Researchers are exploring small language models (SLMs) trained specifically on reasoning-important tokens, distilling intelligence from larger models, which could dramatically reduce the computational resources needed for advanced reasoning.
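Distilling intelligence from a larger model, as described above, is commonly done by training the small model to match the large model's temperature-softened output distribution. A generic sketch of that objective in plain Python, not any specific paper's recipe:

```python
import math

def softmax(logits, temperature=1.0):
    # Softening with temperature > 1 exposes the teacher's full ranking
    # over tokens, not just its top choice.
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's softened distribution to the
    teacher's -- the quantity a distilled small model is trained to
    minimize over many examples."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero exactly when the student reproduces the teacher's distribution, and grows as their rankings over next tokens diverge.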

THE FUTURE OF KNOWLEDGE AND TRUTH-SEEKING

Aravind envisions a future where AI tools like Perplexity enable rapid knowledge creation and widespread truth-seeking. He hopes that by making information more accessible and verifiable, AI can help humans reduce biases and foster a deeper understanding of the world. The 'Perplexity Pages' feature, allowing users to create Wikipedia-style articles from their Q&A sessions, exemplifies this vision. This democratized knowledge production, combined with AI's ability to simplify complex topics for various audiences, could lead to a more informed and rational global society, ultimately bridging divides and promoting peace.

THE ROLE OF COMPUTE IN AGI DEVELOPMENT

The quest for Artificial General Intelligence (AGI) is less about model weights and more about access to immense inference compute. Aravind highlights that while pre-training establishes foundational intelligence, the ability to apply iterative compute for fluid intelligence – continuous research, verification, and reasoning – is paramount. This capability, requiring vast GPU clusters, could lead to a concentration of power among those who can afford such computational resources, raising concerns about equitable access to advanced AI. He speculates on a 'recursive self-improvement' scenario where AI systems constantly learn and grow, driven by massive compute.

ENTREPRENEURIAL ADVICE: PASSION AND PURPOSE

For aspiring entrepreneurs, Aravind emphasizes following one's genuine passion rather than chasing perceived market demands. Building a company around an idea one loves provides the resilience needed to overcome challenges. He underscores the importance of a strong support system and the intrinsic motivation derived from seeing the product improve and serve users. Recognizing the 'good fortune' of building something impactful drives sustained effort. This dedication, akin to an athlete's 10,000 hours of practice, is crucial for fostering significant innovation and personal fulfillment.

HUMANITY, CURIOSITY, AND THE ABUNDANCE OF INTELLIGENCE

Aravind believes that despite the rise of advanced AI, human curiosity remains special and will be amplified by these new tools. Perplexity's mission aligns with enhancing human curiosity, recognizing that even with highly capable AGI, the inherent human drive to explore and understand will persist. He maintains an optimistic outlook, envisioning a future where an abundance of intelligence and knowledge leads to a more flourishing society, where work feels like play, and humans have more time for meaningful connections, ultimately fostering a more grateful and less scarcity-driven world. However, he acknowledges the complex ethical considerations surrounding AI, particularly regarding human-AI emotional connections and the potential for unintended consequences.

Common Questions

How is Perplexity different from a traditional search engine like Google?

Perplexity is described as an 'answer engine' or 'knowledge discovery engine,' rather than a traditional search engine. While Google provides a list of links, Perplexity directly provides comprehensive, cited answers by synthesizing information from multiple sources on the web, significantly reducing LLM hallucinations. This is akin to how an academic writes a paper, backing every statement with a citation.

Topics

Mentioned in this video

Product: Falcon

SpaceX's rocket family, mentioned as an example of a design Elon Musk initially focused on but later pivoted from to Starship, highlighting the value of a 're-design for higher payloads' insight from a potential AI.

Software: Twitter (social graph)

Elon Musk's perceived desire for Twitter to have its own in-house data centers, reflecting a mentality of self-reliance.

Person: Novak Djokovic

The tennis player, cited as an example of an underdog who became objectively the 'GOAT' through hard work, not starting as the best, similar to Ronaldo's inspiring journey.

Company: Excite

An early search engine that Google almost sold itself to, used as an example of a search engine that failed to understand user intent well, prompting Google's 'user is never wrong' philosophy.

Person: Dzmitry Bahdanau

A graduate student in Bengio's lab who identified 'soft attention' and demonstrated its effectiveness over RNNs with less compute.

Software: Googlebot

Google's web crawler, mentioned as a comparison point for Perplexity's own bot.

Software: Large Language Model (LLM)

A type of artificial intelligence program designed to understand and generate human language. Perplexity integrates LLMs with search to produce cited answers.

Product: H100

NVIDIA's high-performance GPU, used as a benchmark when discussing the roughly 30x inference-efficiency improvement claimed for the upcoming B100.

Software: PixelRNN

A DeepMind paper that demonstrated a fully convolutional model could do autoregressive modeling using masked convolutions, paving the way for efficient parallel training.

Software: LLaMA 3 70B

An open-source large language model from Meta, praised for being close to GPT-4 in capability and enabling more companies to innovate with powerful models.

Software: GPT (models)

Generative Pre-trained Transformer models, used by Perplexity in its early stages for tasks like generating research proposals for academic API accounts.

Concept: TF-IDF (Term Frequency-Inverse Document Frequency)

An older but still effective information retrieval technique that assigns weight to terms based on their frequency in a document and inverse frequency across documents.

Software: Samantha (AI voice)

OpenAI's latest voice demonstration, reminiscent of Samantha from the film Her and noted for its flirty, human-like qualities, which raised questions about the nature of human-AI connection.

Concept: Self-Attention

A variant of attention where a model attends to different positions of a single sequence to compute a representation of the same sequence, central to the Transformer architecture.

Concept: Retrieval Augmented Generation (RAG)

A technique that combines information retrieval with text generation, where relevant documents are retrieved and used to inform the generation of answers by a large language model. It's a foundational component of Perplexity.

Software: Perplexity Bot

Perplexity's web crawler, designed to crawl the web, respecting robots.txt files and dealing with modern web complexities like JavaScript rendering to build its index.

Software: TensorRT-LLM

A framework developed with NVIDIA, used by Perplexity to optimize its LLaMA-based models at the kernel level for high throughput and low latency.

Product: Overture
Organization: AWS (Amazon Web Services)
Tool: Transformer
Organization: Sonar
Tool: Common Crawl
Tool: Character.ai
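The Self-Attention concept listed above can be illustrated in a few lines: each position's query is compared against every key in the same sequence, and softmax weights average the values. Learned query/key/value projections and multiple heads are omitted for brevity, so this is a minimal sketch rather than a full Transformer layer:

```python
import math

def self_attention(x):
    """Scaled dot-product self-attention over a list of vectors.

    With projections omitted, queries = keys = values = x, so each
    output row is an attention-weighted average of the input rows.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position's query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in x]
        # Softmax over positions (shifted by the max for numerical stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x))
                    for j in range(d)])
    return out

out = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Because every position attends to every other, the whole sequence can be processed in parallel, which is the training-efficiency property the Transformer discussion above highlights.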
