MARS: Enabling Autoregressive Models Multi-Token Generation • Paper • 2604.07023 • Published 7 days ago • 36
FastVLM • Collection • Efficient Vision Encoding for Vision Language Models • 8 items • Updated Mar 2 • 111
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models • Paper • 2604.00688 • Published 13 days ago • 9