Distributed Data Parallel
Concept
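PyTorch's multi-process implementation of data parallelism (torch.nn.parallel.DistributedDataParallel). Each process holds a full replica of the model and trains on a different shard of the batch; gradients are averaged across all processes with an all-reduce during the backward pass, so every replica applies the same update and the copies stay in sync.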
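The averaging step can be illustrated without PyTorch. The following is a minimal sketch in plain Python, not the library's API: the "processes" are simulated in a loop, and `local_gradient`, `all_reduce_mean`, the toy dataset, and the learning rate are all hypothetical names and values chosen for illustration. It shows that when each replica computes a gradient on its own shard and the gradients are averaged, every replica takes the same step and the weights never diverge.

```python
# Pure-Python simulation of the gradient averaging used in data parallelism.
# Assumption: this mimics the arithmetic only; it does not use PyTorch or
# any real communication backend.

def local_gradient(w, shard):
    """Gradient of the loss 0.5 * (w*x - y)^2, averaged over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker receives the mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

# Toy data generated by y = 2x, split evenly across two simulated workers.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
world_size = 2
shards = [data[rank::world_size] for rank in range(world_size)]

weights = [0.0] * world_size  # each worker starts from the same replica
lr = 0.05

for step in range(200):
    # Each worker computes a gradient on its own shard only.
    grads = [local_gradient(weights[r], shards[r]) for r in range(world_size)]
    # Gradients are averaged so all workers see the same value.
    grads = all_reduce_mean(grads)
    # Each worker applies the identical update to its own replica.
    weights = [w - lr * g for w, g in zip(weights, grads)]
```

Because the shards here are equal in size, the averaged gradient equals the gradient over the full batch, which is why the replicas behave like a single model trained on all the data.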