V-JEPA
Software / App
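A video JEPA (Joint-Embedding Predictive Architecture) model that masks regions across consecutive frames and predicts representations of the unseen areas. Training is regularized with an EMA (exponential moving average) encoder and a stop-gradient on the prediction targets.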
Mentioned in 1 video
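The mechanism in the description can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the model's actual code: the "encoders" are single scalar weights, the mask hides the same patch positions in every frame (one possible reading of "temporal masking on consecutive frames"), and every name here (`tube_mask`, `ema_update`, `train_step`, `decay`, `lr`) is hypothetical.

```python
import random


def tube_mask(num_frames, num_patches, mask_ratio=0.5, seed=0):
    """Hide the same patch positions in every frame (assumed masking scheme)."""
    rng = random.Random(seed)
    hidden = set(rng.sample(range(num_patches), int(num_patches * mask_ratio)))
    # True marks a patch the online encoder does not get to see.
    return [[p in hidden for p in range(num_patches)] for _ in range(num_frames)]


def ema_update(target_weights, online_weights, decay=0.99):
    """Move each target weight a small step toward the matching online weight."""
    return [decay * t + (1.0 - decay) * o
            for t, o in zip(target_weights, online_weights)]


def train_step(online_w, target_w, visible, hidden, lr=0.1, decay=0.99):
    """One toy update with scalar 'encoders' (feature = weight * input)."""
    prediction = online_w * visible      # online branch sees the visible input
    target = target_w * hidden           # target branch encodes the hidden input
    error = prediction - target
    # Stop-gradient: the target is treated as a constant, so the gradient of
    # error**2 is taken with respect to online_w only.
    grad_online = 2.0 * error * visible
    new_online = online_w - lr * grad_online
    # The target encoder is never trained directly; it trails the online one.
    new_target = ema_update([target_w], [new_online], decay)[0]
    return new_online, new_target, error ** 2
```

In a real implementation the stop-gradient would come from the autodiff framework (for example, detaching the target tensor) rather than from a hand-written gradient. The pairing is commonly used so that the prediction targets change slowly and the encoder is discouraged from collapsing to a constant output.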