Group Relative Policy Optimization (GRPO)
Concept
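A reinforcement learning method developed by DeepSeek to improve model efficiency and accuracy. For each prompt it samples a group of responses and scores each one relative to the group's average reward, which removes the need for the separate value (critic) model that PPO relies on. It was also used to train DeepSeek R1.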
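The "group relative" part of the name can be illustrated with a minimal sketch of how advantages might be computed for one prompt. The function name `group_relative_advantages`, the `eps` stabilizer, the use of the population standard deviation, and the reward values are all assumptions made for illustration; this is not DeepSeek's implementation, and it leaves out the clipped policy-ratio objective and the KL penalty that a full training step would also need.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (illustrative helper).

    rewards: rewards for several responses sampled from the SAME prompt.
    eps: small constant (an assumption) to avoid dividing by zero when
         every response in the group receives the same reward.
    """
    mu = mean(rewards)        # group mean acts as the baseline
    sigma = pstdev(rewards)   # population std of the group (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])

# If all responses score the same, no response is preferred over another.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Responses that beat the group average get a positive advantage and the rest get a negative one, so the baseline comes from the samples themselves rather than from a learned value model.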