10 LoRA adapters and 6 datasets comparing algorithmic-template SFT against QwQ distillation on Qwen2.5-1.5B-Instruct across 4 reasoning domains.
Reasoning Degeneration Dev