Reward Feedback Learning

Concept

A preference-tuning method that first trains a reward model on human preferences (pairwise or list-wise ratings of generated images) and then fine-tunes the image generation model so that its outputs score higher under that reward model.
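The two stages can be illustrated with a minimal sketch. It makes several assumptions that are not part of the definition above: images are stood in for by small feature vectors, the reward model is linear, the pairwise loss is the Bradley-Terry form -log(sigmoid(r_preferred - r_rejected)), and the "generator" is just a parameter vector pulled back toward its starting point by a quadratic penalty. All function names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reward(w, x):
    # Hypothetical linear reward model: r(x) = w . x
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_loss(w, preferred, rejected):
    # Bradley-Terry style loss (an assumption): small when the
    # preferred sample scores higher than the rejected one.
    return -math.log(sigmoid(reward(w, preferred) - reward(w, rejected)))

def train_reward_model(pairs, dim, lr=0.5, epochs=200):
    # Stage 1: fit the reward model on (preferred, rejected) pairs
    # with plain stochastic gradient descent on the pairwise loss.
    w = [0.0] * dim
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(w, preferred) - reward(w, rejected)
            scale = 1.0 - sigmoid(margin)  # magnitude of the loss gradient
            for i in range(dim):
                w[i] += lr * scale * (preferred[i] - rejected[i])
    return w

def tune_generator(theta, w, lr=0.1, steps=50, anchor_weight=0.1):
    # Stage 2: gradient ascent on r(theta) - (anchor_weight / 2) * ||theta - start||^2.
    # The penalty term is an assumption; it keeps the toy generator near its start.
    start = list(theta)
    theta = list(theta)
    for _ in range(steps):
        for i in range(len(theta)):
            grad = w[i] - anchor_weight * (theta[i] - start[i])
            theta[i] += lr * grad
    return theta

# Toy preference data: each pair is (preferred features, rejected features).
pairs = [
    ([1.0, 0.2], [0.0, 0.3]),
    ([0.8, 0.5], [0.1, 0.5]),
    ([0.9, 0.1], [0.2, 0.4]),
]

w = train_reward_model(pairs, dim=2)
theta0 = [0.0, 0.0]
tuned = tune_generator(theta0, w)
print("reward before tuning:", reward(w, theta0))
print("reward after tuning: ", reward(w, tuned))
```

In a real system the reward model would typically be a neural network over images (often conditioned on the prompt), and the second stage would backpropagate the reward through the image generator rather than update a bare vector; the sketch only shows how the two stages fit together.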
