Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Text Generation • 28B • Updated 4 days ago • 2.93k • 49
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning Paper • 2602.21534 • Published 7 days ago • 23
PyVision-RL: Forging Open Agentic Vision Models via RL Paper • 2602.20739 • Published 8 days ago • 29
EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots Paper • 2602.18071 • Published 12 days ago • 22
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 12 days ago • 474
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 21 days ago • 55
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models Paper • 2602.12036 • Published 20 days ago • 93
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models Paper • 2602.10224 • Published 21 days ago • 19
Article The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+ 29 days ago • 51
The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation Paper • 2601.17737 • Published Jan 25 • 55
Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Jan 27 • 61