MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources Paper • 2509.25531 • Published Sep 29, 2025 • 8
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 36
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Paper • 2508.18672 • Published Aug 26, 2025 • 10
mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning Paper • 2508.10137 • Published Aug 13, 2025 • 2
Lizard: An Efficient Linearization Framework for Large Language Models Paper • 2507.09025 • Published Jul 11, 2025 • 18
Rewriting Pre-Training Data Boosts LLM Performance in Math and Code Paper • 2505.02881 • Published May 5, 2025 • 4
ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code Paper • 2506.02314 • Published Jun 2, 2025
EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition Paper • 2505.20033 • Published May 26, 2025 • 4
EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection Paper • 2506.09827 • Published Jun 11, 2025 • 20
Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets Paper • 2506.04598 • Published Jun 5, 2025 • 7
Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models Paper • 2503.23714 • Published Mar 31, 2025 • 1
Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs Paper • 2411.08719 • Published Nov 10, 2024 • 1
Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs Paper • 2412.14471 • Published Dec 19, 2024
Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search Paper • 2503.04412 • Published Mar 6, 2025 • 5
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published Dec 19, 2024 • 4