SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents Paper • 2602.12984 • Published 16 days ago • 5