From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
Haiwen Diao
Paranioar
AI & ML interests
Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model
Recent Activity
upvoted an article 2 days ago
NEO-unify: Building Native Multimodal Unified Models End to End
published an article 2 days ago
NEO-unify: Building Native Multimodal Unified Models End to End
upvoted a paper 4 days ago
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?