MMDIT
Software / App
A text-guided diffusion model. Because text conditions its input, it cannot be reused as is for image-to-text evaluation tasks.
Mentioned in 1 video