Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published 15 days ago • 43
Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling Paper • 2510.17314 • Published Oct 20, 2025 • 2
InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training Paper • 2510.15859 • Published Oct 17, 2025 • 13
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning Paper • 2508.16949 • Published Aug 23, 2025 • 24
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research Paper • 2511.19399 • Published Nov 24, 2025 • 61
Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning Paper • 2509.25534 • Published Sep 19, 2025 • 3
Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training Paper • 2509.21500 • Published Sep 25, 2025 • 20