Reward Feedback Learning

Concept

A preference-tuning method with two stages: first, a reward model is trained on human preferences (pairwise or listwise ratings); then the image generation model is fine-tuned to produce images that score higher under that reward model.
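
The two stages can be illustrated with a toy sketch. It is a minimal illustration under stated assumptions, not the method as presented in the lecture: each "image" is reduced to a single number, the simulated rater always prefers the larger value, the reward model is a one-parameter linear function fit with a Bradley-Terry pairwise loss (a common choice, assumed here), and the "generator" is only a mean parameter `mu`. The pull-back term weighted by `beta` is also an assumption, added so the tuned generator does not drift without bound. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_reward_model(pairs, lr=0.1, epochs=200):
    """Stage 1: fit r(x) = w * x on (preferred, rejected) pairs.

    Uses the Bradley-Terry loss -log sigmoid(r(preferred) - r(rejected)),
    an assumed choice for pairwise preference data.
    """
    w = 0.0
    for _ in range(epochs):
        grad = 0.0
        for preferred, rejected in pairs:
            diff = preferred - rejected
            # Derivative of -log sigmoid(w * diff) with respect to w.
            grad += -(1.0 - sigmoid(w * diff)) * diff
        w -= lr * grad / len(pairs)
    return w


def tune_generator(w, mu0=0.0, beta=0.5, lr=0.1, steps=200):
    """Stage 2: move the generator mean mu toward higher reward.

    Toy objective: w * mu - (beta / 2) * (mu - mu0) ** 2.
    The beta term (an assumption) keeps mu near its starting point.
    """
    mu = mu0
    for _ in range(steps):
        grad = w - beta * (mu - mu0)
        mu += lr * grad  # gradient ascent on the toy objective
    return mu


# Simulated human preferences: the rater prefers the larger of two samples.
rng = random.Random(0)
pairs = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    pairs.append((max(a, b), min(a, b)))

w = train_reward_model(pairs)
mu = tune_generator(w)
print(f"learned reward weight w = {w:.3f}, tuned generator mean mu = {mu:.3f}")
```

In this sketch the reward weight comes out positive because the simulated rater prefers larger values, and the generator mean settles near `mu0 + w / beta`. A real system would replace the scalar with generated images, the linear function with a learned neural reward model, and the mean update with gradient steps on the image generator's weights.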

Mentioned in 1 video

Videos Mentioning Reward Feedback Learning

Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 6 - Model Training

Stanford Online
