NVIDIA CUTLASS
Software / App · Mentioned in 1 video
An open-source library of CUDA C++ templates from NVIDIA that provides primitives for efficient matrix multiplication (GEMM) and for moving data between GPU memory levels. FlashAttention-2 is built on these primitives.
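To give a feel for what "primitives for efficient matrix multiplication" means, here is a minimal sketch of tiling, the idea such libraries build on: the output is computed one block at a time so that each block's inputs can sit in fast memory while they are reused. This is plain Python for illustration only, not the library's API; the function name `tiled_matmul` and the `tile` parameter are made up for this example.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), visiting the work tile by tile.

    Illustrative only: tiled_matmul and tile are hypothetical names,
    not part of any NVIDIA library.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    c = [[0.0] * n for _ in range(m)]

    # Pick one output tile at a time. On a GPU, a tile like this would
    # typically be assigned to one group of cooperating threads.
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # Step through the shared dimension in tile-sized slices. On a
            # GPU, each slice would be staged in fast on-chip memory first.
            for k0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    print(tiled_matmul(a, b, tile=2))
```

The result is the same as an ordinary triple loop; only the order of the work changes. On real hardware that reordering is what makes the difference, because each loaded slice is reused many times before it is discarded.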