LLaVA
Software / App
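A family of open-source vision-language models that project image embeddings from a vision encoder into a language model's input space, so the language model can reason over images alongside text. LLaVA 1.5 works with single images; later releases such as LLaVA-OneVision extend this to multiple images and video.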
Mentioned in 1 video
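As a rough illustration of what "injecting image embeddings into a language model" can mean, here is a toy sketch in plain Python. It is not LLaVA's actual code: it assumes a single linear projection from the vision encoder's feature size to the language model's embedding size, and a placeholder token that marks where the image goes in the prompt. The names `IMAGE_TOKEN`, `project`, and `splice_image`, and all the numbers, are hypothetical.

```python
# Toy sketch only: every name and size here is an illustrative assumption.

IMAGE_TOKEN = "<image>"  # assumed placeholder marking the image position in the prompt


def project(patch_features, weight):
    """Map each vision-encoder patch vector into the language model's embedding size."""
    return [
        [sum(x * w for x, w in zip(patch, column)) for column in zip(*weight)]
        for patch in patch_features
    ]


def splice_image(prompt_tokens, text_embeddings, image_embeddings):
    """Replace the placeholder token's embedding with the projected image embeddings."""
    sequence = []
    for token, embedding in zip(prompt_tokens, text_embeddings):
        if token == IMAGE_TOKEN:
            sequence.extend(image_embeddings)  # one row per image patch
        else:
            sequence.append(embedding)
    return sequence


# Two image patches with 3-dim vision features, projected into a 2-dim text embedding space.
patches = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # shape: (vision_dim=3, text_dim=2)
image_embeddings = project(patches, weight)

prompt_tokens = ["Describe", IMAGE_TOKEN, "briefly"]
text_embeddings = [[0.1, 0.2], [0.0, 0.0], [0.3, 0.4]]
sequence = splice_image(prompt_tokens, text_embeddings, image_embeddings)
```

The resulting `sequence` is what the language model would consume: ordinary text embeddings with the projected image patches sitting where the placeholder was. Real models use learned weights and far larger dimensions.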