DeepSeek researchers

Person

The developers of Group Relative Policy Optimization (GRPO), a reinforcement learning algorithm used in training Qwen 3.

Mentioned in 1 video