The single-transformer approach that handles depth, motion, and camera pose together.
Two Minute Papers