GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent Paper • 2603.13875 • Published 16 days ago • 33
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 Space • Running on CPU Upgrade • 209 • Explore synthetic data experiments as an interactive bookshelf
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published about 1 month ago • 88
Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek Article • Jan 27 • 45
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Article • Aug 18, 2025 • 95
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF Text Generation • 31B • Updated Jan 30 • 135k • 558
The Ultra-Scale Playbook 🌌 Space • Running • 3.76k • The ultimate guide to training LLMs on large GPU clusters