What is the 'reversal curse' in language models?

The 'reversal curse' refers to the phenomenon where language models trained on a fact stated in one direction (e.g., 'A is the director of B') struggle to answer questions that require the relation in the reverse direction (e.g., 'Who is the director of B?', where the model must produce A given B).

How does in-context learning differ from fine-tuning for generalization?

In-context learning often generalizes better than fine-tuning, especially for tasks like reversing relations or complex syllogisms. Models can answer reversals with high accuracy when data is in context, but perform poorly when fine-tuned on only one direction.

Why might in-context learning be better at generalization than parameter learning?

In-context learning may generalize better because information in context is available in richer detail and can be used more flexibly at test time, whereas parametric learning, while good at extracting statistical structures, can be more tied to how information was explicitly conveyed in training.

What are the trade-offs between offline augmentation and online retrieval for generalization?

Offline augmentation is compute-intensive at training but efficient at test time. Online retrieval is free at training but requires extra inference compute at test time. RL-based methods offer generalization beyond anticipated needs but also require extra compute at both training and test stages.

How might these generalization strategies relate to natural intelligence?

The strategies explored, like offline augmentation and online retrieval, parallel findings in neuroscience suggesting the brain uses both offline (e.g., hippocampal replay) and online methods to bridge generalization gaps, leveraging parametric learning and episodic memory.

What computational ingredients are needed for effective generalization?

Effective generalization requires consolidated information (cached statistics, facts, insights), procedures for flexible reasoning in context, and episodic memory to preserve specific experiences in rich detail for later flexible reuse.

How does context length affect language model generalization?

Generalization is sensitive to context length, but the impact depends on the type of information. Models perform better with real-world entity information in context than with nonsense words, suggesting retrieval is content-sensitive and influenced by pre-training.

Can language models truly learn new knowledge from synthetic data?

Augmenting training data with synthetic traces doesn't create new information but extracts and makes explicit latent information already present in the data, improving the model's ability to generalize. It's about revealing implicit connections.

What are the limitations of the hippocampus-transformer analogy?

The analogy breaks down partly in implementation (the hippocampus doesn't use transformer attention) and in scale, as transformers can't fit a lifetime of experience in context. Additionally, the hippocampus engages in more generative retrieval and confabulation, potentially offering more flexible generalization.

Key Moments

Stanford CS25: Transformers United V6 I Distinct Modes of Generalization from Parameters and Context

Q: Can language models overcome failures in latent generalization?

Yes, models can overcome latent generalization failures through strategies like offline augmentation (using in-context learning to generate reasoning traces and then fine-tuning) or online methods such as explicit retrieval from episodic memory or implicit retrieval learned via reinforcement learning.

Stanford Online

Education | 6 min read | 73 min video

May 20, 2026|651 views|25|2

Stanford, Stanford Online, Transformers, AI, Artificial Intelligence

TL;DR

Language models generalize worse from information fine-tuned into their parameters than from the same information presented in context, failing on tasks like reversing relations. Strategies to narrow this gap include data augmentation and retrieval, which parallel mechanisms proposed for human learning.

Key Insights

Fine-tuning models on relations stated in one direction (e.g., 'A directed B') leaves them worse than chance (26% accuracy) on the reversed relation (e.g., 'B was directed by A'), while placing the same data in context allows 99% accuracy.

Models trained from scratch on 20,000 relationships fail to generalize to held-out reversals (0% accuracy) even with ample data, demonstrating the issue is fundamental, not just specific to fine-tuning.

In-context learning excels at flexible use of specific information, while parametric learning specializes in extracting statistical structures common across many documents, which can support in-context learning.

Augmenting training data by using in-context learning to generate reasoning traces and then fine-tuning on this combined data can achieve performance comparable to or better than in-context learning itself on tasks like syllogisms.

Episodic retrieval, even with imperfect precision, significantly improves generalization on reversal and codebook tasks compared to parametric learning alone, suggesting memory access is key.

Reinforcement learning can teach models to implicitly retrieve information from their training corpus at test time, improving generalization on tasks like syllogisms but showing less benefit for relation reversals.

Distinct generalization patterns emerge from parameters versus context

Language models learn information through two primary methods: optimizing parameters during training on a large corpus and learning from information provided directly in the prompt's context (in-context learning). While both methods aim to imbue models with knowledge and skills, research reveals striking differences in how these models generalize based on the learning route. A key observation stems from work on the 'reversal curse,' where models fine-tuned on a relation (e.g., 'Daphne Barrington is the director of A Journey Through Time') struggle to answer questions requiring the reverse relation (e.g., 'Who is the person who directed A Journey Through Time?'). In contrast, when the same factual information is presented within the model's context window, it readily handles such reversals. This suggests that information stored in network parameters and information processed in context are generalized differently, prompting a deeper investigation into these distinct generalization abilities.

Relational reversals reveal poor generalization from parametric learning

Experiments comparing fine-tuned models against in-context learning models using datasets containing thousands of documents highlight significant generalization gaps. On a dataset based on the reversal curse, fine-tuning resulted in accuracy worse than chance (26%) for reversed relations, while placing the entire dataset in context yielded 99% accuracy. Similar patterns emerged with syllogistic generalizations, where fine-tuned models showed only a slight improvement over chance, whereas in-context learning performed better. This discrepancy is not merely an artifact of fine-tuning; training models from scratch on a large dataset of relationships also showed a complete failure (0% accuracy) to generalize to held-out reversals, even when the model could perfectly recall trained relations. This indicates a fundamental limitation in how parametric learning encodes and generalizes relational knowledge, contrasting sharply with the flexibility of in-context learning.
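The evaluation setup described above can be sketched as follows. This is a minimal illustration, not the speaker's actual dataset code: the function names, the multiple-choice format, the number of options, and the placeholder entities are all assumptions made for the example. Training documents state each fact in the forward direction only, and test items ask for the name given the description.

```python
import random

def make_reversal_split(pairs, n_options=4, seed=0):
    """Build forward-only training docs and reversed multiple-choice test items.

    pairs: list of (name, description) tuples (hypothetical toy data).
    """
    rng = random.Random(seed)
    # Training documents always put the name first.
    train_docs = [f"{name} is {desc}." for name, desc in pairs]
    names = [name for name, _ in pairs]
    test_items = []
    for name, desc in pairs:
        others = [n for n in names if n != name]
        distractors = rng.sample(others, k=min(n_options - 1, len(others)))
        options = distractors + [name]
        rng.shuffle(options)
        # Test questions put the description first and ask for the name.
        test_items.append(
            {"question": f"Who is {desc}?", "options": options, "answer": name}
        )
    return train_docs, test_items

def accuracy(predict, test_items):
    """predict(question, options) -> chosen option; returns fraction correct."""
    correct = sum(
        predict(item["question"], item["options"]) == item["answer"]
        for item in test_items
    )
    return correct / len(test_items)

# Placeholder entities, invented for illustration only.
example_pairs = [
    ("Ada Example", "the director of The Sample Film"),
    ("Bo Placeholder", "the composer of Another Sample Song"),
    ("Cy Standin", "the author of A Third Sample Book"),
    ("Di Dummy", "the architect of The Sample Tower"),
]
train_docs, test_items = make_reversal_split(example_pairs)
```

In the fine-tuning condition a model would see only train_docs during training; in the in-context condition the same documents would be placed in the prompt before each test question.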

The role of statistical structures and implicit information

The superior generalization observed with in-context learning is partly attributed to the prevalence of structures on the internet that naturally demonstrate relation reversals and logical inferences. Models, therefore, learn to utilize these structures when information is provided in context. However, the failure of parametric learning to generalize these structures implies that models may not be effectively extracting latent information from training data. Training data often contains explicit facts alongside implicit, latent information (e.g., parent-child relationships implying ancestry). While in-context learning can flexibly use this information, parametric learning appears more tied to how information was explicitly conveyed, limiting its adaptability at test time. Nevertheless, consolidated parametric knowledge is crucial for enabling flexible in-context learning and for extracting statistical structures common across many documents.
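To make the distinction between explicit and latent information concrete, here is a small sketch based on the parent-child example. The function name and the toy family are hypothetical, and it assumes each child has a single recorded parent, which is a simplification chosen for brevity.

```python
def ancestor_pairs(parent_of):
    """Return every (ancestor, descendant) pair implied by child -> parent facts."""
    pairs = set()
    for child in parent_of:
        node = child
        seen = {child}  # guards against cycles in malformed data
        while node in parent_of and parent_of[node] not in seen:
            node = parent_of[node]
            seen.add(node)
            pairs.add((node, child))
    return pairs

# Explicit facts as they might be stated in training documents (toy data).
parent_of = {"Bea": "Alex", "Cam": "Bea"}
explicit = {(parent, child) for child, parent in parent_of.items()}

# Latent facts follow from the explicit ones but are never stated directly.
latent = ancestor_pairs(parent_of) - explicit
```

Here the fact that Alex is an ancestor of Cam appears in latent but in no training statement, which is the kind of information the talk suggests parametric learning often fails to extract.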

Word co-occurrences can sometimes mask latent generalization failures

A critical finding is that language models can sometimes appear to generalize better than their failures in latent generalization would suggest. This is often due to leveraging word co-occurrences present in the training data. For instance, in syllogism tests, even if a model fails to perform a logical inference, it might still correctly answer a test statement if the relevant words (e.g., 'eagles,' 'wings,' 'fly') frequently co-occur in the training data, leading to a statistically plausible, albeit not logically derived, answer. This phenomenon highlights how parametric learning can find workarounds for its structural generalization weaknesses by exploiting statistical patterns. While useful on average, this reliance on co-occurrences can lead to incorrect generalizations in specific instances, illustrating that statistical learning is beneficial but imperfect.
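A toy scorer illustrates how co-occurrence alone can pick a plausible answer without any logical inference. This is a hedged sketch, not the method used in the talk: the function names, the whitespace tokenization, and the four example sentences are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def words_of(text):
    """Lowercase a sentence and return its set of words, ignoring periods."""
    return set(text.lower().replace(".", "").split())

def cooccurrence_counts(docs):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for doc in docs:
        for pair in combinations(sorted(words_of(doc)), 2):
            counts[pair] += 1
    return counts

def cooccurrence_score(counts, statement):
    """Sum the co-occurrence counts over all word pairs in the statement."""
    return sum(counts[pair] for pair in combinations(sorted(words_of(statement)), 2))

# Toy training documents, invented for illustration.
docs = [
    "Eagles are birds.",
    "Birds have wings.",
    "Eagles fly with wings.",
    "Fish have gills.",
]
counts = cooccurrence_counts(docs)
true_score = cooccurrence_score(counts, "Eagles have wings.")
false_score = cooccurrence_score(counts, "Eagles have gills.")
```

The scorer prefers the true statement only because 'eagles' and 'wings' appear together in a document, not because it chained 'eagles are birds' with 'birds have wings'. On data where such surface statistics point the wrong way, the same shortcut would prefer a false statement.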

Augmenting training data bridges the generalization gap

To address the limitations of parametric learning, strategies to improve generalization were explored. One offline approach involves data augmentation: using in-context learning to generate reasoning traces that elaborate on existing data, filling in missing links or reversing relations. This augmented data, along with the original training set, is then used for fine-tuning. This method has shown success, achieving generalization performance comparable to or even better than in-context learning on tasks like syllogisms. The key isn't discovering new information, but making implicit, latent information explicit and accessible to the model. This process reframes the idea of synthetic data generation, emphasizing the extraction and refinement of existing knowledge rather than the creation of novel facts.
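The augmentation loop described above might look roughly like the sketch below. In the talk, a language model with the data in context generates the elaborations; here a rule-based reverse_statement function stands in for that model so the example stays self-contained. The function names, the sentence pattern, and the placeholder entities are all assumptions.

```python
import re

# Matches sentences of the form "<name> is the <role> of <work>."
PATTERN = re.compile(r"(?P<name>.+?) is the (?P<role>\w+) of (?P<work>.+?)\.")

def reverse_statement(doc):
    """Stand-in generator: restate a forward fact in the reverse direction."""
    match = PATTERN.fullmatch(doc)
    if match is None:
        return None  # leave documents we cannot parse un-augmented
    return f"The {match['role']} of {match['work']} is {match['name']}."

def augment(train_docs, generate=reverse_statement):
    """Return the original documents plus any new generated elaborations."""
    augmented = list(train_docs)
    for doc in train_docs:
        new_doc = generate(doc)
        if new_doc and new_doc not in augmented:
            augmented.append(new_doc)
    return augmented

# Placeholder training document, invented for illustration.
train_docs = ["Ada Example is the director of The Sample Film."]
augmented_docs = augment(train_docs)
```

Fine-tuning would then run on augmented_docs rather than train_docs. Passing a different generate callable, such as one that prompts a model, is where the in-context step would plug in.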

Retrieval mechanisms enable flexible, adaptive generalization

Recognizing the impracticality of anticipating all future knowledge needs for offline augmentation, attention shifted to online, test-time strategies. Episodic retrieval, where relevant information is pulled from memory into the context, proved highly effective. Using an 'oracle' episodic memory system (which perfectly recalls relevant information but includes distractors), researchers found that while parametric learning failed on reversal and codebook tasks, systems with episodic retrieval generalized well. This demonstrates that bringing past experiences into context allows for more flexible use of information, supporting generalization in ways parametric learning alone cannot. This mechanism is thought to be analogous to how human episodic memory functions.
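A minimal sketch of bringing stored documents back into context at test time follows. Note the substitution: the talk describes an oracle memory, whereas this example uses simple word-overlap ranking as a stand-in retriever. The function names, the prompt layout, and the placeholder documents are assumptions for illustration.

```python
def tokenize(text):
    """Lowercase text and return its set of words, ignoring basic punctuation."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(question, memory, k=2):
    """Return the k stored documents sharing the most words with the question."""
    query = tokenize(question)
    ranked = sorted(memory, key=lambda doc: len(query & tokenize(doc)), reverse=True)
    return ranked[:k]

def build_prompt(question, memory, k=2):
    """Place retrieved documents (including any distractors) before the question."""
    context = "\n".join(retrieve(question, memory, k))
    return f"{context}\n\nQuestion: {question}\nAnswer:"

# Placeholder episodic memory, invented for illustration.
memory = [
    "Ada Example is the director of The Sample Film.",
    "Bo Placeholder is the composer of Another Sample Song.",
    "Cy Standin is the author of A Third Sample Book.",
]
prompt = build_prompt("Who is the director of The Sample Film?", memory)
```

With k greater than one, the prompt contains the relevant document alongside a distractor, loosely mirroring the imperfect-precision setting: the model can still answer the reversed question because the fact itself is now in context.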

Learned, implicit retrieval via RL offers further generalization benefits

Building on retrieval, the possibility of models learning to implicitly retrieve information using reinforcement learning (RL) was investigated. The idea is that models could learn to regenerate necessary information within their chain of thought, effectively bringing it back into context. This RL-based approach showed generalization gains on syllogism tasks by training on one dataset and testing on another. However, it yielded less benefit for relation reversals, likely because generating the specific information needed for reversals is inherently difficult and may require broad enumeration rather than targeted recall. This suggests that learned retrieval can help with some generalization challenges but is not a universal solution.

Bridging AI and human intelligence through complementary learning systems

The observed patterns in language models resonate with theories of natural intelligence, particularly the complementary roles of the hippocampus (episodic memory) and the neocortex (parametric learning). The challenges faced by models in generalizing latent information from parameters echo potential difficulties in human learning, where explicit instruction can sometimes limit flexible application. Strategies like offline data augmentation and online retrieval in AI mirror potential mechanisms in human cognition, such as memory replay and proactive recall. This suggests that effective generalization, both artificial and natural, relies on a combination of consolidated knowledge (parameters) and flexible, context-aware access to specific experiences (in-context learning and episodic memory).

Common Questions

Language models primarily learn in two ways: by optimizing network parameters over a large training corpus to encode knowledge, and by learning from information presented directly in their context, often through examples provided in the prompt.

Topics

Artificial Intelligence, Neuroscience & the Brain, AI & Machine Learning, Technology & Innovation, Science & Mathematics, In-context Learning, Language Models, Cognitive Science, Episodic Memory, Parameter Learning, Retrieval Augmentation

Mentioned in this video

Concepts

Reversal Curse

A phenomenon where language models struggle to reverse learned relations, a key inspiration for the presented research.

Chinchilla

Refers to Chinchilla-optimal language model scaling, a concept relating training compute budget to model size and the amount of training data.

People

Andrew Lampinen

Member of technical staff at Anthropic, previously a staff research scientist at Google DeepMind, with a PhD in cognitive psychology from Stanford. His research bridges AI and cognitive science.

Owen Evans

Associated with the research group that published the 'Reversal Curse' paper.

Tania Lombrozo

Author of the paper 'Learning by Thinking in Natural and Artificial Minds'.

James Whittington

Researcher who has done work on the analogy between Transformer attention and hippocampal memory.

Companies

Anthropic

Company where Andrew Lampinen is a member of technical staff.

Organizations

Google DeepMind

Company where Andrew Lampinen previously worked as a staff research scientist.

Software & Apps

Deep sequence models

A paper discussed in relation to how statistical learning can be hard to use in context.

Books

A theory of usable information under computational constraints

An older paper from Stefano Ermon's group at Stanford, relevant to the line of thinking about extracting implicit information.
