arxiv:2604.08377

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Published on Apr 9 · Submitted by taesiri on Apr 10 · #2 Paper of the day

Abstract

SkillClaw enables collective skill evolution in multi-user LLM agent systems by aggregating user interactions to autonomously update and improve reusable skills across the ecosystem.

AI-generated summary

Large language model (LLM) agents such as OpenClaw rely on reusable skills to perform complex tasks, yet these skills remain largely static after deployment. As a result, similar workflows, tool usage patterns, and failure modes are repeatedly rediscovered across users, preventing the system from improving with experience. While interactions from different users provide complementary signals about when a skill works or fails, existing systems lack a mechanism to convert such heterogeneous experiences into reliable skill updates. To address these issues, we present SkillClaw, a framework for collective skill evolution in multi-user agent ecosystems, which treats cross-user and over-time interactions as the primary signal for improving skills. SkillClaw continuously aggregates trajectories generated during use and processes them with an autonomous evolver, which identifies recurring behavioral patterns and translates them into updates to the skill set by refining existing skills or extending them with new capabilities. The resulting skills are maintained in a shared repository and synchronized across users, allowing improvements discovered in one context to propagate system-wide while requiring no additional effort from users. By integrating multi-user experience into ongoing skill updates, SkillClaw enables cross-user knowledge transfer and cumulative capability improvement. Experiments on WildClawBench show that, with limited interaction and feedback, it significantly improves the performance of Qwen3-Max in real-world agent scenarios.
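
The aggregate, evolve, and synchronize loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class names (`Trajectory`, `Skill`, `SharedRepository`), the `note` field, and the rule "refine a skill when the same failure note recurs across at least `min_users` distinct users" are all assumptions made for illustration. The actual evolver is described as an autonomous agent, not a counting rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trajectory:
    user: str
    skill: str
    success: bool
    note: str = ""  # hypothetical field: short label for what went wrong


@dataclass
class Skill:
    name: str
    instructions: list[str]
    version: int = 1


class SharedRepository:
    """Toy shared skill store; every name here is illustrative."""

    def __init__(self, skills):
        self.skills = {s.name: s for s in skills}
        self.pending = []  # trajectories collected since the last evolve()

    def record(self, trajectory):
        self.pending.append(trajectory)

    def evolve(self, min_users=2):
        # Group failures by (skill, note) and track which users hit each one.
        users_by_pattern = defaultdict(set)
        for t in self.pending:
            if not t.success and t.note and t.skill in self.skills:
                users_by_pattern[(t.skill, t.note)].add(t.user)

        updated = []
        for (skill_name, note), users in sorted(users_by_pattern.items()):
            # Assumed rule: only patterns seen by several users become updates.
            if len(users) >= min_users:
                skill = self.skills[skill_name]
                hint = f"Avoid: {note}"
                if hint not in skill.instructions:
                    skill.instructions.append(hint)
                    skill.version += 1
                    updated.append(skill_name)
        self.pending.clear()
        return updated

    def sync(self):
        # Every user pulls the same snapshot of the shared skills.
        return {n: (s.version, list(s.instructions)) for n, s in self.skills.items()}


repo = SharedRepository([Skill("web_search", ["Query the search tool."])])
repo.record(Trajectory("alice", "web_search", False, "query too long"))
repo.record(Trajectory("bob", "web_search", False, "query too long"))
repo.record(Trajectory("carol", "web_search", False, "timeout"))
updated = repo.evolve()
snapshot = repo.sync()
```

In this sketch the failure reported by both alice and bob becomes a refinement of `web_search`, while carol's one-off `timeout` is ignored. The distinct-user threshold is one simple way to keep a single noisy report from changing a skill for everyone.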

Community

Thanks for sharing our work.
The code is released on https://github.com/AMAP-ML/SkillClaw

Found a good walkthrough of this paper here: https://arxivexplained.com/skillclaw-let-skills-evolve-collectively-with-agentic-evolver. The part about SkillClaw was the most interesting bit to me.

the most interesting detail here is how SkillClaw clusters cross-user trajectories into referenced skills and then uses the evolver to translate those patterns into concrete updates. i’m curious about how you decide the granularity of the skill taxonomy, and what happens if two different workflows land in the same cluster or if a single workflow touches multiple skills. btw the arXivLens breakdown helped me parse the method details, especially the nighttime validation gate that defers updates until a safe rollout. a potential pitfall is drift when the taxonomy is too coarse or when noisy signals from users push updates that barely generalize, so i’d love to see an ablation on taxonomy granularity and tests on non-stationary user distributions.
