PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding • Paper • 2604.00886 • Published 5 days ago • 5
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model • Paper • 2603.21986 • Published 14 days ago • 120
AVControl: Efficient Framework for Training Audio-Visual Controls • Paper • 2603.24793 • Published 11 days ago • 26
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting • Paper • 2603.25745 • Published 11 days ago • 14
4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video • Paper • 2603.21618 • Published 14 days ago • 15
Grounding World Simulation Models in a Real-World Metropolis • Paper • 2603.15583 • Published 21 days ago • 153