GSM8K
Study / Research
Mentioned in 1 video
A benchmark of roughly 8,500 grade-school-level math word problems used to evaluate AI mathematical reasoning. The original dataset was later found to contain errors.