JN's Easy Context

Software / App · Mentioned in 1 video

A PyTorch implementation of ring attention. It worked well and was adapted by Gradient to suit their cluster's network topology.
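
For readers unfamiliar with ring attention, the idea can be sketched in a single process. This is an illustrative simulation only, not code from the project described above and not Gradient's adaptation. It uses NumPy rather than PyTorch so that it runs without any distributed setup, and the function names `full_attention` and `ring_attention_sim` are hypothetical. It assumes one attention head, no causal mask, and no more simulated devices than sequence positions. Each simulated device keeps its own block of queries while key/value blocks travel around the ring, and partial results are merged with a running (online) softmax.

```python
import numpy as np


def full_attention(q, k, v):
    """Reference: ordinary softmax attention over the whole sequence."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v


def ring_attention_sim(q, k, v, num_devices):
    """Simulate ring attention on one process (single head, non-causal)."""
    scale = 1.0 / np.sqrt(q.shape[-1])
    # Shard the sequence: each "device" owns one block of Q, K and V.
    q_blocks = np.array_split(q, num_devices)
    kv_blocks = list(zip(np.array_split(k, num_devices),
                         np.array_split(v, num_devices)))

    # Per-device running state for the online softmax.
    row_max = [np.full((qb.shape[0], 1), -np.inf) for qb in q_blocks]
    denom = [np.zeros((qb.shape[0], 1)) for qb in q_blocks]
    numer = [np.zeros((qb.shape[0], v.shape[-1])) for qb in q_blocks]

    for _ in range(num_devices):
        for d in range(num_devices):
            k_blk, v_blk = kv_blocks[d]  # K/V block currently held by device d
            scores = (q_blocks[d] @ k_blk.T) * scale
            new_max = np.maximum(row_max[d], scores.max(axis=-1, keepdims=True))
            # Rescale the partial sums accumulated under the old row maximum.
            correction = np.exp(row_max[d] - new_max)
            p = np.exp(scores - new_max)
            denom[d] = denom[d] * correction + p.sum(axis=-1, keepdims=True)
            numer[d] = numer[d] * correction + p @ v_blk
            row_max[d] = new_max
        # Ring step: every device passes its K/V block to the next device.
        kv_blocks = kv_blocks[-1:] + kv_blocks[:-1]

    return np.concatenate([n / s for n, s in zip(numer, denom)])
```

In a real multi-GPU setting the list rotation would be replaced by point-to-point sends and receives between neighbouring ranks, ideally overlapped with the block computation. The order in which ranks are arranged in the ring is the kind of detail that depends on a cluster's network topology.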