DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published 9 days ago • 58
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published 21 days ago • 367
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published 22 days ago • 487
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published 25 days ago • 340
Learning to Commit: Generating Organic Pull Requests via Online Repository Memory Paper • 2603.26664 • Published 27 days ago • 9