reasoning llm
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
Paper • 2509.05739 • Published • 2
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers
Paper • 2509.03059 • Published • 24
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 13
<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs
Paper • 2509.08358 • Published • 13
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining
Paper • 2508.10975 • Published • 60
A Survey on Latent Reasoning
Paper • 2507.06203 • Published • 93
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published • 79
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
Paper • 2507.16812 • Published • 63
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
Paper • 2504.13828 • Published • 18
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Paper • 2506.09513 • Published • 101
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 75
Large Language Models for Data Synthesis
Paper • 2505.14752 • Published • 49
OpenThoughts: Data Recipes for Reasoning Models
Paper • 2506.04178 • Published • 50
Skywork Open Reasoner 1 Technical Report
Paper • 2505.22312 • Published • 54
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Paper • 2502.13124 • Published • 6
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
Paper • 2504.01943 • Published • 15
OpenCodeReasoning-II: A Simple Test Time Scaling Approach via Self-Critique
Paper • 2507.09075 • Published • 15
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
Paper • 2506.20512 • Published • 48
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Paper • 2506.05209 • Published • 59
Essential-Web v1.0: 24T tokens of organized web data
Paper • 2506.14111 • Published • 46
HardTests: Synthesizing High-Quality Test Cases for LLM Coding
Paper • 2505.24098 • Published • 43
Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
Paper • 2506.14234 • Published • 41
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Paper • 2505.19641 • Published • 68
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Paper • 2504.21776 • Published • 59
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
Paper • 2505.17813 • Published • 58
Thinkless: LLM Learns When to Think
Paper • 2505.13379 • Published • 50
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
Paper • 2504.21233 • Published • 49
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Paper • 2504.05535 • Published • 44
MegaMath: Pushing the Limits of Open Math Corpora
Paper • 2504.02807 • Published • 35
Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data
Paper • 2505.05427 • Published • 4
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Paper • 2406.17557 • Published • 99
OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training
Paper • 2501.08197 • Published • 9
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
Paper • 2412.02595 • Published • 5
DataComp-LM: In search of the next generation of training sets for language models
Paper • 2406.11794 • Published • 55
Improving Pretraining Data Using Perplexity Correlations
Paper • 2409.05816 • Published
Rethinking Reflection in Pre-Training
Paper • 2504.04022 • Published • 80
START: Self-taught Reasoner with Tools
Paper • 2503.04625 • Published • 113
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
Paper • 2503.21460 • Published • 83
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
Paper • 2503.00865 • Published • 64
A Comprehensive Survey on Long Context Language Modeling
Paper • 2503.17407 • Published • 49
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Paper • 2503.15558 • Published • 50
Open Deep Search: Democratizing Search with Open-source Reasoning Agents
Paper • 2503.20201 • Published • 48
Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory
Paper • 2509.14662 • Published • 13
AlphaApollo: Orchestrating Foundation Models and Professional Tools into a Self-Evolving System for Deep Agentic Reasoning
Paper • 2510.06261 • Published • 5