Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks • Paper • 2604.20987 • Published 5 days ago • 20
SWE-chat: Coding Agent Interactions From Real Users in the Wild • Paper • 2604.20779 • Published 5 days ago • 11
SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks • Paper • 2604.20087 • Published 5 days ago • 14
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents • Paper • 2604.14004 • Published 12 days ago • 29
LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation • Paper • 2511.03001 • Published Nov 4, 2025 • 49
Safe and Scalable Web Agent Learning via Recreated Websites • Paper • 2603.10505 • Published Mar 11 • 27
Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance • Paper • 2505.16348 • Published May 22, 2025 • 52
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents • Paper • 2505.15277 • Published May 21, 2025 • 105