datasets
datasets of interest
bigcode/the-stack-dedup • Updated Aug 17, 2023 • 237M • 12k • 392
liwu/MNBVC • Updated 11 days ago • 158k • 596
code-search-net/code_search_net • Updated Feb 23 • 4.14M • 20.3k • 326
HuggingFaceH4/ultrachat_200k • Updated Oct 16, 2024 • 515k • 42k • 680
paper reading
LLM Pruning and Distillation in Practice: The Minitron Approach • Paper • 2408.11796 • Published Aug 21, 2024 • 60
RL's Razor: Why Online Reinforcement Learning Forgets Less • Paper • 2509.04259 • Published Sep 4, 2025 • 7