Flow Group Relative Policy Optimization

Concept

An algorithm adapted from large language model (LLM) training for image generation. It aims to increase the diversity of the images generated for a given prompt, and it updates the model's policy using relative rewards rather than absolute ones.
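To illustrate what "relative rewards" can mean, here is a minimal sketch. It assumes the group-normalized advantage used by GRPO in the LLM setting: each sample's reward minus the group mean, divided by the group standard deviation. The function name, the `eps` parameter, and the reward values are hypothetical and chosen only for illustration. The sketch covers just the reward normalization, not image sampling or the policy update itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sample relative to the other samples for the same prompt.

    Assumption: advantage = (reward - group mean) / (group std + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical reward scores for four images sampled from one prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f}  advantage={a:+.3f}")
```

Under this assumption, images that score above the group average get a positive advantage and are reinforced, while those below it get a negative one. If every image in the group receives the same reward, all advantages are zero and the group contributes no update signal.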

Mentioned in 1 video