Generative Pre-training for Speech with Flow Matching Paper • 2310.16338 • Published Oct 25, 2023
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention Paper • 2006.16236 • Published Jun 29, 2020
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Paper • 2306.15687 • Published Jun 23, 2023
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound Paper • 2502.05139 • Published Feb 7, 2025
Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning Paper • 2512.19687 • Published 10 days ago