RPT

Software / App

A related paper and technique for Reinforcement Pre-Training. RPT uses an external verifier and a sparse reward, and is compared against RLP.

Mentioned in 1 video