Michael Littman: Reinforcement Learning and the Future of AI | Lex Fridman Podcast #144

Lex Fridman
Science & Technology · 4 min read · 117 min video
Dec 13, 2020 · 97,840 views
TL;DR

Michael Littman discusses RL, AI's future, his creative process, and the nuances of intelligence.

Key Insights

1

Reinforcement learning is crucial for developing sophisticated AI that can navigate and learn from real-world interactions.

2

The existential threat of superintelligence is a compelling story but may be premature, as current AI development still requires significant human guidance and complex problem-solving.

3

Social media algorithms, while simpler than AGI, already exert significant control over human behavior, raising concerns about collective intelligence and societal direction.

4

Developing AI is an iterative process that benefits from human intuition, creativity, and specialized expertise, not just brute computational power.

5

The "bitter lesson" in AI suggests that simpler algorithms leveraging massive computation have historically yielded greater progress than complex, hand-crafted solutions.

6

The interaction and social dynamics of driving highlight the challenges in creating AI that truly understands nuanced human behavior and intent.

INSPIRATION FROM SCIENCE FICTION AND CREATIVE EXPRESSION

Michael Littman begins by discussing his early inspirations from science fiction, particularly the film 'Robot & Frank,' which presented a plausible near-term future of home robotics. He contrasts this with the current tendency for technologists to mold people to fit technology, advocating instead for technology that becomes integral to people's lives. This leads to a broader discussion of creativity and humor, including his enjoyment of making parody songs about computer science, which he describes as a far less production-intensive outlet for creative expression than commercial advertising.

THE NUANCES OF ARTIFICIAL GENERAL INTELLIGENCE AND EXISTENTIAL RISK

Littman expresses skepticism regarding the immediate existential threat posed by superintelligence, arguing that the path to AGI is more complex than simply scaling up current systems. He believes that developing AI capable of sophisticated interaction with the world will involve learning much about intelligence itself, providing opportunities for greater control and shaping. This contrasts with more alarmist views, suggesting that while the concern is valid, our current understanding and development trajectory may not lead directly to uncontrollable superintelligence.

THE EVOLUTION AND IMPACT OF REINFORCEMENT LEARNING

Littman traces his journey into reinforcement learning, beginning with his early interest in learning and behavior during his college years. He highlights the pivotal role of papers and interactions with researchers like Richard Sutton and Gerry Tesauro, especially the advancements in temporal difference learning and Q-learning. The success of TD-Gammon is presented as a significant milestone, showcasing the power of self-play and learning from predictions over time, even though applying these techniques to other problems proved challenging initially.
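The temporal-difference idea behind TD-Gammon, "learning from predictions over time," can be sketched in a few lines. Below is a minimal, illustrative TD(0) prediction example on the classic five-state random walk from Sutton and Barto's textbook; the state count, step size, and episode budget are arbitrary choices for the sketch, not details from the episode.

```python
import random

# TD(0) value prediction on a 5-state random walk: start in the middle,
# step left or right at random; reaching the right end pays reward 1,
# falling off the left end pays 0.
N_STATES = 5          # non-terminal states, indexed 0..4
ALPHA = 0.1           # step size (illustrative choice)
EPISODES = 5000

random.seed(0)
v = [0.5] * N_STATES  # value estimates, initialized at 0.5

for _ in range(EPISODES):
    s = N_STATES // 2                      # start in the middle state
    while True:
        s_next = s + random.choice((-1, 1))
        if s_next < 0:                     # left terminal: reward 0
            v[s] += ALPHA * (0.0 - v[s])
            break
        if s_next >= N_STATES:             # right terminal: reward 1
            v[s] += ALPHA * (1.0 - v[s])
            break
        # core TD(0) update: nudge v[s] toward reward (0) + value of next state
        v[s] += ALPHA * (0.0 + v[s_next] - v[s])
        s = s_next

print([round(x, 2) for x in v])  # true values are 1/6, 2/6, ..., 5/6
```

The single update line `v[s] += ALPHA * (0.0 + v[s_next] - v[s])` is the "prediction over time" mechanism the section describes: the estimate for the current state is pulled toward the reward received plus the estimate for the state that follows.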

THE 'BITTER LESSON' AND THE ROLE OF COMPUTATION

Reflecting on the history of AI, Littman discusses Richard Sutton's 'bitter lesson' argument: that general-purpose algorithms leveraging massive computation have historically outperformed complex, knowledge-based systems. He relates this to his own experiences and the general trend in machine learning, where increased data and computational power often yield better results than intricate theoretical designs. This perspective raises questions about the fundamental nature of intelligence and whether it's more about elegant algorithms or simply the ability to process vast amounts of information.

SELF-PLAY AND THE LIMITS OF LANGUAGE MODELS

The conversation delves into the concept of self-play, exemplified by AlphaGo Zero and AlphaZero, and its application in domains like game playing. Littman acknowledges the impressive engineering and performance gains but questions the extrapolation to general intelligence. He discusses language models like GPT-3, noting their remarkable ability to imitate patterns but emphasizing their fundamental limitations without real-world interaction and pushback, suggesting that true understanding and intelligence require more than just statistical learning from existing data.
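As a toy illustration of what self-play means mechanically, here is a hedged sketch (not drawn from AlphaZero or from the episode): a single tabular Q-learner plays both sides of one-pile Nim. Because every move hands the position to the opponent, the update target negates the opponent's best value, a negamax-style form of the Q-learning update.

```python
import random

# Self-play: one tabular Q-learner plays both sides of single-pile Nim.
# Players alternately remove 1-3 objects; whoever takes the last object wins.
# Since the state after a move belongs to the opponent, the target is
# negamax-style: Q(s, a) <- r - max_a' Q(s', a').
PILE, ALPHA, EPS, EPISODES = 10, 0.2, 0.2, 50_000
random.seed(0)
Q = {(s, a): 0.0 for s in range(1, PILE + 1) for a in (1, 2, 3) if a <= s}

def greedy(s):
    return max((a for a in (1, 2, 3) if a <= s), key=lambda a: Q[(s, a)])

for _ in range(EPISODES):
    s = PILE
    while s > 0:
        moves = [a for a in (1, 2, 3) if a <= s]
        a = random.choice(moves) if random.random() < EPS else greedy(s)
        s_next = s - a
        if s_next == 0:            # took the last object: the mover wins
            target = 1.0
        else:                      # opponent moves next, so negate their value
            target = -max(Q[(s_next, b)] for b in (1, 2, 3) if b <= s_next)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s_next

# With no opponent other than itself, the learner discovers the classic
# strategy: leave the opponent a pile size divisible by four.
print({s: greedy(s) for s in range(1, PILE + 1)})
```

The same self-play principle, realized with neural networks and tree search instead of a lookup table, underlies systems like AlphaGo Zero and AlphaZero, which is what makes Littman's question about how far the principle extrapolates interesting.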

THE SOCIAL DIMENSION OF INTELLIGENCE AND LEARNING

Littman shares insights from teaching his children to drive, highlighting the crucial social interaction aspect that is often overlooked in AI development. He emphasizes that driving, and likely many other complex tasks, involves understanding and responding to the intentions of others—a 'theory of mind' that is difficult to replicate. This social complexity, coupled with the high cost of human interaction, presents a significant challenge for AI systems that rely on massive amounts of data and rapid learning cycles.

THE MEANING OF LIFE AND THE QUEST FOR BALANCE

The discussion concludes with Littman's reflection on the meaning of life, which he articulates as balance. He likens this to the iterative learning process in reinforcement learning, where agents learn through trial and error to find optimal states. He also touches upon the importance of human connection and purpose, drawing parallels to his earlier discussions on AI and intelligence, suggesting that understanding ourselves and our place in the world remains a fundamental pursuit, whether through technological exploration or personal reflection.

Common Questions

Is Michael Littman worried about an accidental superintelligence destroying human life?

He is not particularly moved by that scenario. He believes that the process of developing sophisticated AI will inherently teach us how to control and shape it, rather than it spontaneously springing into existence unchecked. He contrasts this with Elon Musk's perspective.

Mentioned in this video

People
Michael Jackson

Artist whose 'Thriller' music video costume Michael Littman mirrored for his overfitting parody video.

Billy Joel

Musician whose song 'Piano Man' was the basis for Michael Littman's parody about the Halting Problem, and whose music was a significant part of his youth.

Nick Bostrom

Author of 'Superintelligence' and a proponent of the AI existential threat argument.

Charles Isbell

A colleague of Michael Littman, mentioned in the context of Westworld discussions and a parody video. He's also known as 'Dr. Awkward'.

Chris Watkins

Researcher who developed Q-learning, which resolved problems in earlier TD learning papers; his visit to Richard Sutton's lab generated great excitement about the algorithm.

Andy Barto

Richard Sutton's collaborator and Michael Littman's Ph.D. mentor; his lab was a center of early reinforcement-learning research.

Magnus Carlsen

World chess champion who uses chess programs to train his mind, illustrating the co-evolution of human and AI intelligence.

Joseph Stalin

Historical figure suggested as a topic for a solo podcast episode.

Douglas Rushkoff

Author of 'Program Or Be Programmed,' which argued everyone needs to become a programmer to have a say in society.

Elon Musk

Entrepreneur whom Michael Littman views as embodying belief in the power of ideas, which naturally leads him to believe in extreme AI outcomes such as existential threat.

Sam Harris

Neuroscientist and philosopher who shares a similar long-term view on AI existential risk as Elon Musk, focusing on the fundamental physics of the universe.

Richard Gerrig

Michael Littman's favorite psychology professor at Yale, whose classes involved deep dives into cognitive science topics.

Dave Ackley

First author of the Boltzmann machine paper, Michael Littman's mentor at Bellcore, and co-host of his current podcast.

Gary Marcus

Contemporary cognitive scientist known for his feisty critiques of deep learning and its limitations.

Satinder Singh

Influential reinforcement learning researcher at DeepMind and former student of Andy Barto, who was particularly impressed with AlphaGo Zero's ability to learn purely from self-play.

Michael Littman

A computer science professor at Brown University specializing in machine learning, reinforcement learning, and artificial intelligence, also known for his computer science parody songs.

Justin Bieber

Pop music artist whose songs Michael Littman came to enjoy through repeated listening, demonstrating neuroplasticity.

Dick Cheney

Former Vice President, whose alleged tactic of including himself on a list of candidates is humorously referenced by Michael Littman.

Fred Jelinek

Early computational linguist known for the quote that 'every time we fire a linguist, performance goes up by ten percent,' highlighting the power of data and compute over human-engineered knowledge in AI.

Groucho Marx

Comedian whose quote 'If you're not having fun, you're doing something wrong' is used to end the podcast.

Adolf Hitler

Historical figure suggested as a topic for a solo podcast episode.

Joe Rogan

Podcaster mentioned as a source of 'wise sage advice' regarding reading comments.

Justin Timberlake

Pop music artist known for a music video set at NeurIPS with robotics themes.

Richard Stallman

Founder of the free software movement and creator of GNU Emacs, described as a 'hell of a hacker'.

Cardi B

Pop music artist mentioned in the context of scholarly listening to pop hits.

Richard S. Sutton

Highly influential researcher in reinforcement learning, known for his TD (Temporal Difference) paper and his book on RL. Michael Littman met him early in his career.

David Silver

Lead researcher on AlphaGo at DeepMind, described as a 'neural net whisperer' for his ability to coax networks to solve complex problems.

Anca Dragan

An AI researcher whom Michael Littman cites as thinking deeply about human-AI interaction and the unsolved challenges of self-driving cars.

Steven Pinker

Cognitive scientist who co-authored a paper critically examining neural networks, similar to contemporary critiques of deep learning.

Gerry Tesauro

Researcher who had a huge impact on early reinforcement learning and showed it could solve problems previously intractable; known for his TD-Gammon work and ability to 'whisper' neural nets.

Geoffrey Hinton

Co-author of the Boltzmann machine paper, a pioneer in neural networks.

Taylor Swift

Pop music artist whose music Michael Littman came to like through repeated exposure, much like Justin Bieber.

Brian Christian

Author of 'The Alignment Problem,' which Michael Littman is currently reading.

Stuart Russell

AI researcher and author of 'Human Compatible: Artificial Intelligence and the Problem of Control,' which also influenced discussions on AI control problems.

Ted Chiang

Author of 'Exhalation' and the short story that became the movie 'Arrival,' noted for his science fiction driven by deep scientific and computer science insights.

Companies
RadioShack

Retail store where Michael Littman first saw and became fascinated by computers.

SimpliSafe

A home security company mentioned as a podcast sponsor.

TikTok

Social media platform whose generation of users is tasked with figuring out how to cope with social media's impact.

Georgia Tech

University that helped produce Michael Littman's most elaborate parody video.

Bellcore

Michael Littman's first job out of college, where he worked with Dave Ackley and first encountered reinforcement learning.

Patreon

Platform for supporting the podcast.

Udacity

Online education platform that helped produce Michael Littman's most elaborate parody video.

Waymo

Google's self-driving car company, whose aggressive and fast cars made Lex Fridman revise his opinion on the difficulty of driving.

BetterHelp

An online therapy service with licensed professionals, mentioned as a podcast sponsor.

Facebook

Social media platform mentioned by Lex Fridman as less trustworthy than Wikipedia.

MasterClass

An online course platform mentioned as a podcast sponsor, offering courses from notable individuals.

YouTube

Platform where advertisers found Michael Littman's videos, leading to his commercial role.

Spotify

Platform where the podcast can be followed.

ExpressVPN

A VPN service mentioned as a podcast sponsor, used by Lex Fridman for privacy.

Twitter

Social media platform where "shitty" interactions occur but are managed by algorithms, which Lex Fridman views as potentially driving society towards better things in the long run.

DeepMind

AI research company that applied TD-Gammon's self-play algorithms to more complex games like Go.

OpenAI

AI research company that applied TD-Gammon's self-play algorithms to more complex games.

IBM

Company that developed Deep Blue, the chess-playing computer.

Tesla

Automaker known for its self-driving technology, which, like Waymo, made Lex Fridman reconsider the complexity of driving.

Software & Apps
Go

An ancient board game that was once considered unsolvable by AI with traditional methods, but was conquered by DeepMind's AlphaGo.

Lisp

A programming language that can implement 'all of intelligence' in a single line of code, according to Lex Fridman.

GNU Emacs

A text editor that Michael Littman passionately defends as superior to Vim, attributing its power to its creator, Richard Stallman.

Apple Podcasts

Platform where the podcast can be reviewed.

AlphaGo

DeepMind's AI program that beat a human world champion at Go, seen by Michael Littman as a remarkable engineering feat.

Vim

A popular text editor that Lex Fridman jokingly tweeted was inferior to Emacs, sparking controversy.

Deep Blue

IBM's chess-playing computer that defeated world champion Garry Kasparov, mentioned as an example of a large company making significant AI investments.

AlphaZero

An extension of AlphaGo Zero that learned to play multiple games (Go, Chess, Shogi) purely through self-play and achieved superhuman performance, with no ceiling yet discovered for its improvement in Go or Chess.

Large Language Models

Transformer-based AI models, such as GPT-3, that have revolutionized natural language processing, generating human-like text while potentially lacking true understanding or grounded interaction.

BASIC

First programming language Michael Littman used, trying to teach his computer to play tic-tac-toe.

Boltzmann Machine

An early neural network model that could learn non-linear concepts, solving the XOR problem that perceptrons couldn't.

GPT-3

A specific large language model known for its highly human-like text generation, which Michael Littman argues doesn't necessarily mean high intelligence but rather that human everyday communication is often rote.

TD-Gammon

A computer backgammon program that learned to play at a world-class level using temporal difference reinforcement learning and self-play.

Concepts
Reinforcement Learning

A field of machine learning concerned with how intelligent agents ought to take actions in an environment so as to maximize cumulative reward; a core focus of Michael Littman's research.

Moore's Law

The observation that the number of transistors in an integrated circuit doubles approximately every two years; Michael Littman discusses its potential limits due to increasing development costs.

Temporal Difference (TD) learning

A reinforcement learning method about making predictions over time, using observed reward and value estimates from future states to update the current state's value estimate.

Halting Problem

A fundamental problem in computer science about determining if any given program will finish or run forever; it was the subject of one of Michael Littman's challenging parody songs.

XOR problem

A classic problem in neural networks that perceptrons couldn't solve, but Boltzmann machines could, helping revive interest in neural networks.

Elo rating system

A method for calculating the relative skill levels of players in competitor-versus-competitor games, used to assess chess-playing ability.

Autonomous vehicles

Self-driving cars, discussed in terms of the challenges of real-world deployment, social interaction, and the contrast between academic and entrepreneurial approaches to development.

Q-learning

An off-policy reinforcement learning algorithm that learns the value of actions in specific states, allowing agents to learn optimal behavior regardless of the policy being followed.

Turing Test

A test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.

Calvin and Hobbes

A comic strip referenced to explain Michael Littman's desire to see things from multiple perspectives.

trolley problem

A classic ethical thought experiment used to examine human moral systems, particularly in the context of autonomous vehicles.
