AgentOhana: Design Unified Data and Training Pipeline for Effective
Agent Learning
Paper
• 2402.15506
• Published • 17
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web
Navigating Agent
Paper
• 2404.03648
• Published • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation
with Multi Layered Thoughts
Paper
• 2405.19893
• Published • 33
Parrot: Efficient Serving of LLM-based Applications with Semantic
Variable
Paper
• 2405.19888
• Published • 7
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
Navigation via Multi-Agent Collaboration
Paper
• 2406.01014
• Published • 33
AgentGym: Evolving Large Language Model-based Agents across Diverse
Environments
Paper
• 2406.04151
• Published • 24
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World
Domains
Paper
• 2406.12045
• Published • 9
Agentless: Demystifying LLM-based Software Engineering Agents
Paper
• 2407.01489
• Published • 65
Internet of Agents: Weaving a Web of Heterogeneous Agents for
Collaborative Intelligence
Paper
• 2407.07061
• Published • 28
Spider2-V: How Far Are Multimodal Agents From Automating Data Science
and Engineering Workflows?
Paper
• 2407.10956
• Published • 7
Sibyl: Simple yet Effective Agent Framework for Complex Real-world
Reasoning
Paper
• 2407.10718
• Published • 19
POGEMA: A Benchmark Platform for Cooperative Multi-Agent Navigation
Paper
• 2407.14931
• Published • 22
AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Paper
• 2407.15711
• Published • 9
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis
Paper
• 2407.13301
• Published • 55
OpenDevin: An Open Platform for AI Software Developers as Generalist
Agents
Paper
• 2407.16741
• Published • 77
LAMBDA: A Large Model Based Data Agent
Paper
• 2407.17535
• Published • 37
AppWorld: A Controllable World of Apps and People for Benchmarking
Interactive Coding Agents
Paper
• 2407.18901
• Published • 35
MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Paper
• 2407.20183
• Published • 43
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS
Paper
• 2408.01584
• Published • 10
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in
Long-Horizon Tasks
Paper
• 2408.03615
• Published • 31
CodexGraph: Bridging Large Language Models and Code Repositories via
Code Graph Databases
Paper
• 2408.03910
• Published • 18
Automated Design of Agentic Systems
Paper
• 2408.08435
• Published • 40
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized
Academic Assistance
Paper
• 2409.04593
• Published • 26
Paper
• 2409.07429
• Published • 32
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research
Repositories
Paper
• 2409.07440
• Published • 8
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
at Scale
Paper
• 2409.16299
• Published • 11
MSI-Agent: Incorporating Multi-Scale Insight into Embodied Agents for
Superior Planning and Decision-Making
Paper
• 2409.16686
• Published • 10
Tutor CoPilot: A Human-AI Approach for Scaling Real-Time Expertise
Paper
• 2410.03017
• Published • 29
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Paper
• 2410.08164
• Published • 26
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language
Models
Paper
• 2410.11710
• Published • 20
Agent-as-a-Judge: Evaluate Agents with Agents
Paper
• 2410.10934
• Published • 23
Revealing the Barriers of Language Agents in Planning
Paper
• 2410.12409
• Published • 27
MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Paper
• 2410.13757
• Published • 32
Web Agents with World Models: Learning and Leveraging Environment
Dynamics in Web Navigation
Paper
• 2410.13232
• Published • 44
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized
Generalist Computer Assistant
Paper
• 2410.18603
• Published • 32
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science
Competitions
Paper
• 2410.20424
• Published • 40
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World
Exploration, Feedback and Optimization
Paper
• 2410.19609
• Published • 18
Teaching Embodied Reinforcement Learning Agents: Informativeness and
Diversity of Language Use
Paper
• 2410.24218
• Published • 6
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper
• 2410.23218
• Published • 49
Adapting While Learning: Grounding LLMs for Scientific Problems with
Intelligent Tool Usage Adaptation
Paper
• 2411.00412
• Published • 10
AndroidLab: Training and Systematic Benchmarking of Android Autonomous
Agents
Paper
• 2410.24024
• Published • 49
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum
Reinforcement Learning
Paper
• 2411.02337
• Published • 36
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large
Language Model
Paper
• 2411.04496
• Published • 22
GazeGen: Gaze-Driven User Interaction for Visual Content Generation
Paper
• 2411.04335
• Published • 15
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer
Use
Paper
• 2411.10323
• Published • 34
Is Your LLM Secretly a World Model of the Internet? Model-Based Planning
for Web Agents
Paper
• 2411.06559
• Published • 16
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Paper
• 2411.13543
• Published • 19
SketchAgent: Language-Driven Sequential Sketch Generation
Paper
• 2411.17673
• Published • 18
Interleaved Scene Graph for Interleaved Text-and-Image Generation
Assessment
Paper
• 2411.17188
• Published • 20
Large Language Model-Brained GUI Agents: A Survey
Paper
• 2411.18279
• Published • 30
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper
• 2412.01928
• Published • 46
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Paper
• 2412.04454
• Published • 71
Unraveling the Complexity of Memory in RL Agents: an Approach for
Classification and Evaluation
Paper
• 2412.06531
• Published • 72
The BrowserGym Ecosystem for Web Agent Research
Paper
• 2412.05467
• Published • 24
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web
Tutorials
Paper
• 2412.09605
• Published • 30
Large Action Models: From Inception to Implementation
Paper
• 2412.10047
• Published • 36
Evaluation Agent: Efficient and Promptable Evaluation Framework for
Visual Generative Models
Paper
• 2412.09645
• Published • 36
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation
Model Internet Agents
Paper
• 2412.13194
• Published • 12
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World
Tasks
Paper
• 2412.14161
• Published • 51
Paper
• 2412.13501
• Published • 30
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital
World
Paper
• 2412.17589
• Published • 14
Agent-SafetyBench: Evaluating the Safety of LLM Agents
Paper
• 2412.14470
• Published • 13
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper
• 2412.21139
• Published • 26
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse
Task Synthesis
Paper
• 2412.19723
• Published • 87
A3: Android Agent Arena for Mobile GUI Agents
Paper
• 2501.01149
• Published • 22
Agent Laboratory: Using LLM Agents as Research Assistants
Paper
• 2501.04227
• Published • 95
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper
• 2501.05366
• Published • 103
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning
and Reflection
Paper
• 2501.04575
• Published • 25
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper
• 2501.10120
• Published • 55
Agent-R: Training Language Model Agents to Reflect via Iterative
Self-Training
Paper
• 2501.11425
• Published • 109
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper
• 2501.12326
• Published • 64
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
Paper
• 2501.11733
• Published • 28
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in
Virtual 3D Spaces
Paper
• 2501.12909
• Published • 74
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI
Systems
Paper
• 2501.11067
• Published • 13
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web
Navigation
Paper
• 2501.16609
• Published • 7
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
Paper
• 2502.02584
• Published • 16
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models
Beneficial?
Paper
• 2502.00674
• Published • 13
MetaChain: A Fully-Automated and Zero-Code Framework for LLM Agents
Paper
• 2502.05957
• Published • 15
InSTA: Towards Internet-Scale Training For Agents
Paper
• 2502.06776
• Published • 9
Hephaestus: Improving Fundamental Agent Capabilities of Large Language
Models through Continual Pre-Training
Paper
• 2502.06589
• Published • 21
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language
Models for Vision-Driven Embodied Agents
Paper
• 2502.09560
• Published • 35
OctoTools: An Agentic Framework with Extensible Tools for Complex
Reasoning
Paper
• 2502.11271
• Published • 18
Autellix: An Efficient Serving Engine for LLM Agents as General Programs
Paper
• 2502.13965
• Published • 19
TAG: A Decentralized Framework for Multi-Agent Hierarchical
Reinforcement Learning
Paper
• 2502.15425
• Published • 9
Self-Taught Agentic Long Context Understanding
Paper
• 2502.15920
• Published • 3
WebGames: Challenging General-Purpose Web-Browsing AI Agents
Paper
• 2502.18356
• Published • 14
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic
Iterative Reasoning Agents
Paper
• 2502.18017
• Published • 21
PodAgent: A Comprehensive Framework for Podcast Generation
Paper
• 2503.00455
• Published • 6
MPO: Boosting LLM Agents with Meta Plan Optimization
Paper
• 2503.02682
• Published • 29
Agent models: Internalizing Chain-of-Action Generation into Reasoning
models
Paper
• 2503.06580
• Published • 20
API Agents vs. GUI Agents: Divergence and Convergence
Paper
• 2503.11069
• Published • 36
STEVE: A Step Verification Pipeline for Computer-use Agent Training
Paper
• 2503.12532
• Published • 17
Survey on Evaluation of LLM-based Agents
Paper
• 2503.16416
• Published • 96
Verbal Process Supervision Elicits Better Coding Agents
Paper
• 2503.18494
• Published • 2
Large Language Model Agent: A Survey on Methodology, Applications and
Challenges
Paper
• 2503.21460
• Published • 83
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement
Learning
Paper
• 2503.21620
• Published • 62
Classical Planning with LLM-Generated Heuristics: Challenging the State
of the Art with Python Code
Paper
• 2503.18809
• Published • 9
Agent S2: A Compositional Generalist-Specialist Framework for Computer
Use Agents
Paper
• 2504.00906
• Published • 27
Advances and Challenges in Foundation Agents: From Brain-Inspired
Intelligence to Evolutionary, Collaborative, and Safe Systems
Paper
• 2504.01990
• Published • 305
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent
Trajectories
Paper
• 2504.08942
• Published • 28
Breaking the Data Barrier -- Building GUI Agents Through Task
Generalization
Paper
• 2504.10127
• Published • 17
SocioVerse: A World Model for Social Simulation Powered by LLM Agents
and A Pool of 10 Million Real-World Users
Paper
• 2504.10157
• Published • 17
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via
Agentic Tree Search
Paper
• 2504.08066
• Published • 16
Paper
• 2504.11442
• Published • 30
MLRC-Bench: Can Language Agents Solve Machine Learning Research
Challenges?
Paper
• 2504.09702
• Published • 18
Exploring Expert Failures Improves LLM Agent Tuning
Paper
• 2504.13145
• Published • 12
UFO2: The Desktop AgentOS
Paper
• 2504.14603
• Published • 29
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to
Deliberative Reasoners
Paper
• 2504.14239
• Published • 14
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making
Abilities
Paper
• 2504.16078
• Published • 21
Paper2Code: Automating Code Generation from Scientific Papers in Machine
Learning
Paper
• 2504.17192
• Published • 124
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and
Prospects
Paper
• 2504.19838
• Published • 23
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
Paper
• 2504.19413
• Published • 48
RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn
Reinforcement Learning
Paper
• 2504.20073
• Published • 12
Agentic Reasoning and Tool Integration for LLMs via Reinforcement
Learning
Paper
• 2505.01441
• Published • 39
Think on your Feet: Adaptive Thinking via Reinforcement Learning for
Social Agents
Paper
• 2505.02156
• Published • 18
Multi-Agent System for Comprehensive Soccer Understanding
Paper
• 2505.03735
• Published • 25
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents
Paper
• 2505.03570
• Published • 8
LLM-Independent Adaptive RAG: Let the Question Speak for Itself
Paper
• 2505.04253
• Published • 14
AI Agents vs. Agentic AI: A Conceptual Taxonomy, Applications and
Challenges
Paper
• 2505.10468
• Published • 10
Creating General User Models from Computer Use
Paper
• 2505.10831
• Published • 5
Visual Agentic Reinforcement Fine-Tuning
Paper
• 2505.14246
• Published • 32
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop
System from Hypothesis to Verification
Paper
• 2505.16938
• Published • 121
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Paper
• 2505.17612
• Published • 81
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based
Mobile GUI Agents
Paper
• 2505.21496
• Published • 38
WebDancer: Towards Autonomous Information Seeking Agency
Paper
• 2505.22648
• Published • 33
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and
Benchmarking Multimodal LLM Agents
Paper
• 2505.24878
• Published • 23
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
Paper
• 2506.03143
• Published • 53
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management
in LLM-based Agentic Multi-Agent Systems
Paper
• 2506.04133
• Published • 3
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow
Development
Paper
• 2506.05010
• Published • 80
Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights
Paper
• 2506.02865
• Published • 33
MedAgentGym: Training LLM Agents for Code-Based Medical Reasoning at
Scale
Paper
• 2506.04405
• Published • 7
Agents of Change: Self-Evolving LLM Agents for Strategic Planning
Paper
• 2506.04651
• Published • 8
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper
• 2506.11763
• Published • 74
Scaling Test-time Compute for LLM Agents
Paper
• 2506.12928
• Published • 63
OAgents: An Empirical Study of Building Effective Agents
Paper
• 2506.15741
• Published • 35
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via
Multi-Agent Multi-Turn Reinforcement Learning
Paper
• 2506.24119
• Published • 51
WebSailor: Navigating Super-human Reasoning for Web Agent
Paper
• 2507.02592
• Published • 126
PresentAgent: Multimodal Agent for Presentation Video Generation
Paper
• 2507.04036
• Published • 11
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
Paper
• 2507.06229
• Published • 76
MIRIX: Multi-Agent Memory System for LLM-Based Agents
Paper
• 2507.07957
• Published • 80
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
Paper
• 2507.15846
• Published • 135
MCPEval: Automatic MCP-based Deep Evaluation for AI Agent Models
Paper
• 2507.12806
• Published • 21
LLM Economist: Large Population Models and Mechanism Design in
Multi-Agent Generative Simulacra
Paper
• 2507.15815
• Published • 7
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI
Agents
Paper
• 2507.19478
• Published • 33
A Survey of Self-Evolving Agents: On Path to Artificial Super
Intelligence
Paper
• 2507.21046
• Published • 85
GenoMAS: A Multi-Agent Framework for Scientific Discovery via
Code-Driven Gene Expression Analysis
Paper
• 2507.21035
• Published • 3
ScreenCoder: Advancing Visual-to-Code Generation for Front-End
Automation via Modular Multimodal Agents
Paper
• 2507.22827
• Published • 101
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent
Foundation Models Training
Paper
• 2508.00414
• Published • 94
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
Paper
• 2507.23348
• Published • 12
CellForge: Agentic Design of Virtual Cell Models
Paper
• 2508.02276
• Published • 39
RoboMemory: A Brain-inspired Multi-memory Agentic Framework for Lifelong
Learning in Physical Embodied Systems
Paper
• 2508.01415
• Published • 8
AgentTTS: Large Language Model Agent for Test-time Compute-optimal
Scaling Strategy in Complex Tasks
Paper
• 2508.00890
• Published • 7
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?
Paper
• 2508.01780
• Published • 21
HyCodePolicy: Hybrid Language Controllers for Multimodal Monitoring and
Decision in Embodied Agents
Paper
• 2508.02629
• Published • 6
Efficient Agents: Building Effective Agents While Reducing Cost
Paper
• 2508.02694
• Published • 86
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from
Experience
Paper
• 2508.04700
• Published • 52
Training Long-Context, Multi-Turn Software Engineering Agents with
Reinforcement Learning
Paper
• 2508.03501
• Published • 59
Enhancing Vision-Language Model Training with Reinforcement Learning in
Synthetic Worlds for Real-World Success
Paper
• 2508.04280
• Published • 35
Agent Lightning: Train ANY AI Agents with Reinforcement Learning
Paper
• 2508.03680
• Published • 138
Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web
Agents
Paper
• 2508.01858
• Published • 20
CoAct-1: Computer-using Agents with Coding as Actions
Paper
• 2508.03923
• Published • 13
OS Agents: A Survey on MLLM-based Agents for General Computing Devices
Use
Paper
• 2508.04482
• Published • 9
WideSearch: Benchmarking Agentic Broad Info-Seeking
Paper
• 2508.07999
• Published • 111
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm
Bridging Foundation Models and Lifelong Agentic Systems
Paper
• 2508.07407
• Published • 99
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of
Deep-Research Agent
Paper
• 2508.06600
• Published • 41
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
• 2508.05748
• Published • 142
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale
Asynchronous RL
Paper
• 2508.07976
• Published • 52
OpenCUA: Open Foundations for Computer-Use Agents
Paper
• 2508.09123
• Published • 33
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with
Long-Term Memory
Paper
• 2508.09736
• Published • 58
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust
GAIA Problem Solving
Paper
• 2508.09889
• Published • 32
UI-Venus Technical Report: Building High-performance UI Agents with RFT
Paper
• 2508.10833
• Published • 45
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent
Distillation and Agentic RL
Paper
• 2508.13167
• Published • 129
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
Paper
• 2508.13186
• Published • 19
CAMAR: Continuous Actions Multi-Agent Routing
Paper
• 2508.12845
• Published • 7
Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic
Thought Reward
Paper
• 2508.12800
• Published • 6
MCP-Universe: Benchmarking Large Language Models with Real-World Model
Context Protocol Servers
Paper
• 2508.14704
• Published • 43
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Paper
• 2508.15144
• Published • 65
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper
• 2508.16153
• Published • 162
PosterGen: Aesthetic-Aware Paper-to-Poster Generation via Multi-Agent
LLMs
Paper
• 2508.17188
• Published • 17
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo
Paper
• 2508.18370
• Published • 3
ReportBench: Evaluating Deep Research Agents via Academic Survey Tasks
Paper
• 2508.15804
• Published • 15
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World
Tasks via MCP Servers
Paper
• 2508.20453
• Published • 63
AWorld: Orchestrating the Training Recipe for Agentic AI
Paper
• 2508.20404
• Published • 38
UItron: Foundational GUI Agent with Advanced Perception and Planning
Paper
• 2508.21767
• Published • 12
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper
• 2509.02547
• Published • 235
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn
Reinforcement Learning
Paper
• 2509.02544
• Published • 127
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making
through Multi-Turn Reinforcement Learning
Paper
• 2509.08755
• Published • 57
MCP-AgentBench: Evaluating Real-World Language Agent Performance with
MCP-Mediated Tools
Paper
• 2509.09734
• Published • 16
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading
Paper
• 2509.09995
• Published • 16
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for
Open-Ended Deep Research
Paper
• 2509.13312
• Published • 106
Scaling Agents via Continual Pre-training
Paper
• 2509.13310
• Published • 117
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic
Data and Scalable Reinforcement Learning
Paper
• 2509.13305
• Published • 91
Towards General Agentic Intelligence via Environment Scaling
Paper
• 2509.13311
• Published • 72
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon
Agents
Paper
• 2509.13309
• Published • 67
ReSum: Unlocking Long-Horizon Search Intelligence via Context
Summarization
Paper
• 2509.13313
• Published • 80
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform
Data
Paper
• 2509.15221
• Published • 111
Towards Human-like Multimodal Conversational Agent by Generating
Engaging Speech
Paper
• 2509.14627
• Published • 1
LIMI: Less is More for Agency
Paper
• 2509.17567
• Published • 104
ARE: Scaling Up Agent Environments and Evaluations
Paper
• 2509.17158
• Published • 36
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering
Tasks?
Paper
• 2509.16941
• Published • 21
Paper
• 2509.17336
• Published • 10
GEM: A Gym for Agentic LLMs
Paper
• 2510.01051
• Published • 91
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel
Execution
Paper
• 2509.25301
• Published • 20
JoyAgent-JDGenie: Technical Report on the GAIA
Paper
• 2510.00510
• Published • 4
Multi-Agent Tool-Integrated Policy Optimization
Paper
• 2510.04678
• Published • 31
Don't Just Fine-tune the Agent, Tune the Environment
Paper
• 2510.10197
• Published • 30
AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement
Learning Framework for Stock Trading
Paper
• 2510.14264
• Published • 10
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Paper
• 2510.16872
• Published • 112
DeepAgent: A General Reasoning Agent with Scalable Toolsets
Paper
• 2510.21618
• Published • 102
Tongyi DeepResearch Technical Report
Paper
• 2510.24701
• Published • 103
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds
Paper
• 2511.08892
• Published • 214
HaluMem: Evaluating Hallucinations in Memory Systems of Agents
Paper
• 2511.03506
• Published • 95
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist
Paper
• 2511.08521
• Published • 39
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
Paper
• 2511.14460
• Published • 21
General Agentic Memory Via Deep Research
Paper
• 2511.18423
• Published • 170
Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
Paper
• 2511.13288
• Published • 19
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
Paper
• 2511.18538
• Published • 302
Deep Research: A Systematic Survey
Paper
• 2512.02038
• Published • 73
PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design
Paper
• 2512.04082
• Published • 14
Step-GUI Technical Report
Paper
• 2512.15431
• Published • 133
Memory in the Age of AI Agents
Paper
• 2512.13564
• Published • 155
Paper
• 2512.16301
• Published • 108
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models
Paper
• 2512.24618
• Published • 152
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents
Paper
• 2512.22047
• Published • 30
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization
Paper
• 2512.24615
• Published • 119
Agentic Reasoning for Large Language Models
Paper
• 2601.12538
• Published • 202
Kimi K2.5: Visual Agentic Intelligence
Paper
• 2602.02276
• Published • 259