When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published 2 days ago • 104
Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models Paper • 2601.07287 • Published Jan 12 • 6
SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing Paper • 2603.19228 • Published 22 days ago • 68