Vision-Language Models

CLIP, LLaVA, GPT-4V, and visual reasoning
