JN's Easy Context
Software / App
Mentioned in 1 video
A PyTorch implementation of ring attention that worked well and was adapted by Gradient for their cluster network topology.