Reward Feedback Learning
Concept
A preference-tuning method with two stages: first, a reward model is trained on human preferences (pairwise or listwise ratings of images); then the image generation model is fine-tuned so that its outputs score higher under that reward model.
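The two stages in the definition can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the source: images are stood in for by 3-dimensional feature vectors, the reward model is linear, the pairwise preferences are synthetic (generated from a hidden vector `true_w`), the loss is a Bradley-Terry style pairwise loss, and the "generator" is a toy parameter vector `theta` whose output is `tanh(theta)`. A real system would use a neural reward model and an actual image generator, so treat the helper names `pairwise_loss`, `train_reward_model`, and `tune_generator` as hypothetical.

```python
import numpy as np

def pairwise_loss(r_preferred, r_rejected):
    """Bradley-Terry style loss: mean of -log sigmoid(r_preferred - r_rejected)."""
    margin = r_preferred - r_rejected
    # -log sigmoid(m) == log(1 + exp(-m)); logaddexp keeps this numerically stable.
    return float(np.mean(np.logaddexp(0.0, -margin)))

def train_reward_model(preferred, rejected, lr=0.5, steps=200):
    """Stage 1 (toy): fit a linear reward r(x) = w . x to preference pairs."""
    w = np.zeros(preferred.shape[1])
    diff = preferred - rejected
    for _ in range(steps):
        margin = diff @ w
        # Gradient of the mean pairwise loss w.r.t. w is -mean(sigmoid(-margin) * diff).
        weight = 1.0 / (1.0 + np.exp(margin))  # equals sigmoid(-margin)
        grad = -(diff * weight[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def tune_generator(theta, w, lr=0.1, steps=50):
    """Stage 2 (toy): the 'generator' outputs x = tanh(theta); ascend the reward w . x."""
    for _ in range(steps):
        x = np.tanh(theta)
        theta = theta + lr * w * (1.0 - x ** 2)  # gradient of w . tanh(theta)
    return theta

# Synthetic preferences: a hidden scoring vector decides which item of each pair wins.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])  # assumption: stands in for human taste
a = rng.normal(size=(256, 3))
b = rng.normal(size=(256, 3))
a_better = (a @ true_w) > (b @ true_w)
preferred = np.where(a_better[:, None], a, b)
rejected = np.where(a_better[:, None], b, a)

w = train_reward_model(preferred, rejected)

theta0 = np.zeros(3)
theta1 = tune_generator(theta0, w)
reward_before = float(w @ np.tanh(theta0))
reward_after = float(w @ np.tanh(theta1))
```

In this sketch the generator is tuned by plain gradient ascent on the learned reward. That choice is only the simplest way to show "produce outputs with higher rewards"; the source does not specify how the generator update is carried out.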