SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents Paper • 2602.22124 • Published 7 days ago
Article Who Routes LLM Routers? RouterArena: Building the Evaluation Foundation for LLM Routing Nov 11, 2025 • 14
EXP-Bench: Can AI Conduct AI Research Experiments? Paper • 2505.24785 • Published May 30, 2025 • 24 • 3
Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents Paper • 2502.16069 • Published Feb 22, 2025 • 20