Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 7 days ago • 95
Sober-Clever/sft_reasoning-activation_7Task-End2End-GPTGen_Qwen3-1.7B-ckpt741-Industrial_EP1 Updated Feb 6
Sober-Clever/Qwen3-1.7B_base_e2e-AmazonMix3-EP2_General_reasoning-activate-ep1_RLonOffice_ckpt1000 Updated Feb 6
Sober-Clever/Qwen3-1.7B_base_e2e-AmazonMix3-EP2_General_reasoning-activate-ep1_RLonGames_ckpt900 Updated Feb 5
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation Paper • 2510.21003 • Published Oct 23, 2025 • 8
Quantile Advantage Estimation for Entropy-Safe Reasoning Paper • 2509.22611 • Published Sep 26, 2025 • 119
Article • Mixture of Experts Explained • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching Paper • 2412.17153 • Published Dec 22, 2024 • 39
Sober-Clever/distilbert-base-uncased-finetuned-squad-d5716d28 Question Answering • Updated Nov 20, 2023 • 5