JN's Easy Context
Software / App
A PyTorch implementation of ring attention. It worked well in practice, and Gradient adapted it to their cluster's network topology.
Mentioned in 1 video