How do diffusion models differ from other machine learning models in learning data distributions?

Diffusion models excel at mapping from one high-dimensional space to another, especially in low-data regimes. Other models also learn distributions, but diffusion's strength is generating complex data from noisy inputs when the number of training samples is small relative to the data's dimensionality.

What are some diverse applications of diffusion models beyond image generation?

Diffusion models are surprisingly versatile. They are used in protein folding prediction (DeepMind's Nobel Prize work), robotics (diffusion policy), weather forecasting (GenCast), text and code generation, and molecule binding prediction, as well as in areas like failure sampling and continuous and discrete diffusion LLMs.

How has the diffusion model framework evolved since its inception?

Early diffusion models focused on predicting either the original data or the noise (error) that was added to it. Later innovations introduced easier objectives such as predicting velocity, as in flow matching, which simplifies both the mathematics and the engineering. Architectures also evolved from U-Nets to diffusion transformers with cross-attention.

What is flow matching and why is it considered a significant advancement?

Flow matching simplifies the diffusion process by learning a global velocity between the noisy data and the original data, allowing for a more direct path from noise to data. This significantly reduces the complexity and code required, making the process more elegant and efficient.

What are the limitations of current diffusion models regarding inference steps?

A key limitation is that you generally cannot extend the number of inference steps beyond the number used during training. If a model was trained on 100 steps, using more than 100 steps at test time can lead to nonsensical results, unlike some other generative models.

How do diffusion models relate to the concept of artificial general intelligence (AGI)?

Diffusion models incorporate two key aspects observed in biological brains: leveraging randomness for learning and thinking in concepts rather than one token at a time. While not a complete solution for AGI, they offer valuable mechanisms that align with how biological intelligence operates.

Key Moments

The ML Technique Every Founder Should Know

Q: Can you explain the basic noising and denoising process of diffusion models?

The core idea is to take a data sample (like an image) and gradually add noise until it becomes random static. The model is then trained to reverse this process, learning to 'denoise' the corrupted data step-by-step to recover the original information or generate new data from noise.

Y Combinator

Science & Technology | 3 min read | 28 min video

Jan 22, 2026 | 36,411 views


TL;DR

Diffusion models are a versatile ML framework for data distribution learning, enabling advanced AI in various fields.

Key Insights

Diffusion is a fundamental machine learning framework capable of learning data distributions across any domain given sufficient data.

It excels at mapping between high-dimensional spaces, even with limited training data.

The core diffusion process involves systematically adding noise to data and then training a model to reverse this process (denoising).

Innovations in diffusion models have focused on refining the noise schedule and optimizing the denoising objective functions (e.g., predicting error or velocity).

Flow matching offers a simplified approach by directly learning a global velocity between noisy data and clean data, significantly reducing code complexity and inference time.

Diffusion models are increasingly applied beyond image generation to areas like protein folding, robotics, weather forecasting, text generation, and code writing.

WHAT IS DIFFUSION?

Diffusion is presented as a foundational machine learning framework designed to learn probability distributions of data across any domain, provided sufficient data. Its unique strength lies in its ability to handle mappings between high-dimensional spaces, making it particularly powerful even in low-data regimes. This allows it to learn complex data patterns with relatively few training samples compared to the dimensionality of the data.

THE NOISE AND DENOISE PROCESS

At its core, the diffusion process involves taking an initial data sample (like an image) and progressively adding noise to it over several steps, creating a series of noised-up versions. The model is then trained to reverse this process, learning to denoise the data systematically. This 'denoiser' is the trained model, which is tasked with reconstructing the original data from a noisy input by learning intermediate representations during the reversal.
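The noising half of this process can be sketched in a few lines. This is a minimal illustration, not the method of any particular model: the linear beta schedule, the 1000-step count, and the names `make_alpha_bar` and `add_noise` are all assumptions chosen for the example, following the closed-form noising commonly used in DDPM-style models.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal variance kept at each step (assumed linear beta schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Jump directly to noise level t: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    a_t = alpha_bar[t]
    x_t = np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))          # stand-in for a tiny image
x_early, _ = add_noise(x0, 10, alpha_bar, rng)    # still close to x0
x_late, _ = add_noise(x0, 999, alpha_bar, rng)    # almost pure static
```

A denoiser would be trained on pairs like `(x_t, t)` to recover either `x0` or `eps`; the network itself is omitted here.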

EVOLUTION AND KEY INNOVATIONS

The diffusion framework has evolved significantly since its early papers. Key innovations have focused on optimizing the 'noise schedule' (how noise is added) and refining the 'loss function', or objective, for the denoising model. Instead of predicting the original data, models can be trained to predict the added noise (the error) or the velocity. This continuous refinement, often driven by metrics like Fréchet Inception Distance (FID), has led to more stable training and better results.
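To make the different objectives concrete, the sketch below builds the three regression targets from the same noisy sample. It assumes a variance-preserving setup where `alpha**2 + sigma**2 == 1`, and it uses one common definition of velocity, `v = alpha * eps - sigma * x0`; other formulations define velocity differently. The function names are hypothetical.

```python
import numpy as np

def training_targets(x0, eps, alpha, sigma):
    """Return the noisy input and three possible regression targets."""
    x_t = alpha * x0 + sigma * eps
    return {
        "x_t": x_t,
        "x0": x0,                       # objective 1: predict the clean data
        "eps": eps,                     # objective 2: predict the added noise
        "v": alpha * eps - sigma * x0,  # objective 3: predict the velocity (assumed definition)
    }

def x0_from_v(x_t, v, alpha, sigma):
    """Recover clean data from a velocity prediction (needs alpha^2 + sigma^2 = 1)."""
    return alpha * x_t - sigma * v

def eps_from_v(x_t, v, alpha, sigma):
    """Recover the noise from a velocity prediction (same assumption)."""
    return sigma * x_t + alpha * v

def mse(pred, target):
    """Mean squared error, the usual loss for any of the three targets."""
    return float(np.mean((pred - target) ** 2))
```

Because the targets are interconvertible under this assumption, the choice mostly changes how errors are weighted across noise levels rather than what information the model must learn.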

FLOW MATCHING: SIMPLICITY AND POWER

A significant advancement is 'flow matching,' which simplifies the diffusion process. Instead of learning a step-by-step reversal, flow matching learns a global 'velocity' vector that directly guides the data from a noisy state to its original distribution. This approach dramatically reduces the complexity of the code required—sometimes to just a few lines—and makes the training and inference processes more efficient by avoiding numerous intermediate steps.
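The "few lines" claim can be illustrated with a minimal sketch. It assumes the straight-line interpolation between noise and data that is often used in flow matching; the names `flow_matching_batch` and `euler_sample` are hypothetical, and no network is trained. Instead, the sampler is exercised on a toy case where the ideal velocity field is known in closed form.

```python
import numpy as np

def flow_matching_batch(data, rng):
    """Build one training batch: interpolated points, times, and velocity targets."""
    noise = rng.standard_normal(data.shape)
    t = rng.uniform(0.0, 1.0, size=(data.shape[0], 1))
    x_t = (1.0 - t) * noise + t * data  # straight line from noise (t=0) to data (t=1)
    target = data - noise               # constant velocity along that line
    return x_t, t, target

def euler_sample(velocity_fn, x, num_steps=50):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    dt = 1.0 / num_steps
    for k in range(num_steps):
        x = x + dt * velocity_fn(x, k * dt)
    return x

# Toy case: if every data point equals `mu`, the ideal velocity field is
# (mu - x) / (1 - t), so it can stand in for a trained model.
mu = np.array([2.0, -1.0])

def ideal_velocity(x, t):
    return (mu - x) / (1.0 - t)

rng = np.random.default_rng(0)
samples = euler_sample(ideal_velocity, rng.standard_normal((4, 2)))
```

In a real setup, a network would be regressed onto `target` given `(x_t, t)` with a mean squared error loss, and then passed to the sampler as `velocity_fn`.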

BROAD APPLICABILITY ACROSS DOMAINS

Diffusion models have demonstrated remarkable versatility, extending far beyond their initial applications in image and video generation (like Stable Diffusion, Midjourney, and Sora). They are now instrumental in protein folding (AlphaFold), robotics (diffusion policies), weather forecasting (GenCast), natural language processing (diffusion LLMs), and even code generation. This wide applicability stems from the framework's ability to model diverse data distributions.

DIFFUSION'S ROLE IN GENERAL INTELLIGENCE

Diffusion models offer insights into artificial general intelligence because they incorporate two aspects observed in biological brains: leveraging randomness and working with whole concepts rather than one token at a time. Current LLMs typically generate output token by token, whereas diffusion can use randomness and potentially model a concept that unfolds into a large output, which aligns better with how biological intelligence operates. Diffusion is not a complete solution for AGI, but it is a promising avenue for developing more sophisticated AI.

A FOUNDATIONAL TOOL FOR RESEARCHERS AND FOUNDERS

For founders involved in training models, diffusion procedures are highly recommended as a core component of the training loop, regardless of the specific application. For those not directly training models, it's crucial to update their understanding of diffusion models' rapidly improving capabilities. The framework's ongoing simplification and demonstrated success across diverse fields suggest it will continue to drive innovation in AI products and redefine various economic sectors.


Common Questions

Diffusion models are a fundamental machine learning framework designed to learn probability distributions of data across any domain, provided sufficient data. They work by adding noise to data and then training a model to reverse this process, effectively denoising the data back to its original form or similar representations.