Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 108
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 165
google/siglip2-so400m-patch16-naflex Zero-Shot Image Classification • 1B • Updated Feb 21, 2025 • 1.21M • 49
X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again Paper • 2507.22058 • Published Jul 29, 2025 • 39
Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models Paper • 2507.07104 • Published Jul 9, 2025 • 45