FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published 4 days ago • 16