Ring Attention

Software / App

A technique that Gradient uses to train its long-context models, improving GPU utilization.
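
As a rough illustration of the general idea behind Ring Attention (not Gradient's implementation, which is not described here), the sketch below simulates it in a single process with NumPy. The sequence is split into blocks, one per "device"; each device keeps its query block, and the key/value blocks are passed around a ring while every device folds each incoming block into a running softmax. The names ring_attention_sim and full_attention are hypothetical, the devices are just loop indices, and the attention is non-causal and single-head. The overlap of communication with computation, which is what helps utilization on real hardware, is not modeled.

```python
import numpy as np


def full_attention(q, k, v):
    """Reference: ordinary softmax attention over the whole sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v


def ring_attention_sim(q, k, v, n_devices):
    """Single-process simulation of ring attention (hypothetical helper).

    Assumes the sequence length is divisible by n_devices.
    """
    d = q.shape[-1]
    q_blocks = np.split(q, n_devices)
    kv_blocks = list(zip(np.split(k, n_devices), np.split(v, n_devices)))

    # Per-device running statistics for a numerically stable online softmax.
    row_max = [np.full((qb.shape[0], 1), -np.inf) for qb in q_blocks]
    denom = [np.zeros((qb.shape[0], 1)) for qb in q_blocks]
    acc = [np.zeros((qb.shape[0], v.shape[-1])) for qb in q_blocks]

    for _ in range(n_devices):
        for i in range(n_devices):
            # Device i attends to whichever key/value block it currently holds.
            k_blk, v_blk = kv_blocks[i]
            scores = q_blocks[i] @ k_blk.T / np.sqrt(d)
            new_max = np.maximum(row_max[i], scores.max(axis=-1, keepdims=True))
            scale = np.exp(row_max[i] - new_max)  # rescale earlier partial sums
            p = np.exp(scores - new_max)
            denom[i] = denom[i] * scale + p.sum(axis=-1, keepdims=True)
            acc[i] = acc[i] * scale + p @ v_blk
            row_max[i] = new_max
        # Ring step: every device hands its key/value block to the next device.
        kv_blocks = kv_blocks[-1:] + kv_blocks[:-1]

    return np.concatenate([a / s for a, s in zip(acc, denom)])


rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))  # 16 tokens, head dimension 8
out = ring_attention_sim(q, k, v, n_devices=4)
print(np.allclose(out, full_attention(q, k, v)))
```

In this sketch no device ever holds the full key/value sequence or the full attention matrix, only one block at a time, which is why the approach scales to long contexts.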

Mentioned in 1 video