GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published 7 days ago • 221
QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization Paper • 2604.05963 • Published 3 days ago • 4
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 263