Ruler Suite

Software / App

A set of benchmarks for evaluating long-context models that is more comprehensive than a basic single-needle retrieval test, including multi-needle retrieval, variable tracking, and summary statistics.
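As a rough illustration of what a multi-needle retrieval check involves, the sketch below hides several key-value "needle" sentences among filler text and scores how many values a model's answer recovers. It is not taken from the suite itself: the function names, the needle sentence template, and the substring-based scoring are all assumptions made for this example.

```python
import random


def build_multi_needle_prompt(needles, filler_sentence, n_filler, seed=0):
    """Scatter key-value 'needle' sentences among repeated filler sentences.

    Hypothetical helper for illustration; not part of any benchmark's API.
    """
    rng = random.Random(seed)
    sentences = [filler_sentence] * n_filler
    # Pick distinct insertion points, then insert from the highest position
    # down so earlier inserts do not shift the positions still to be used.
    positions = sorted(rng.sample(range(n_filler + 1), len(needles)), reverse=True)
    for pos, (key, value) in zip(positions, needles.items()):
        sentences.insert(pos, f"The special number for {key} is {value}.")
    return " ".join(sentences)


def recall_score(model_answer, needles):
    """Fraction of needle values that appear verbatim in the model's answer."""
    found = sum(1 for value in needles.values() if str(value) in model_answer)
    return found / len(needles)


# Example data (made up for this sketch).
needles = {"apple": 4821, "river": 9307, "cloud": 1550}
prompt = build_multi_needle_prompt(needles, "The grass is green.", n_filler=50)
score = recall_score("The numbers are 4821 and 9307.", needles)
```

Here `score` reflects two of three needles recovered. Substring matching is a deliberately simple choice; a real harness would likely need stricter answer parsing.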

Mentioned in 1 video

Videos Mentioning Ruler Suite

How to train a Million Context LLM — with Mark Huang of Gradient.ai

Latent Space
