Kexin Huang

737443h

https://kexinhuang02.github.io

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 23 hours ago

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

Paper • 2603.22446 • Published 3 days ago • 4

upvoted a paper 1 day ago

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Paper • 2603.22117 • Published 3 days ago • 22

liked a Space 6 days ago

HuggingFaceFW/finephrase

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

208

Explore synthetic data experiments as an interactive bookshelf

liked 3 datasets about 2 months ago

authored 3 papers 6 months ago

RePO: ReLU-based Preference Optimization

Paper • 2503.07426 • Published Mar 10, 2025 • 2

SPRec: Self-Play to Debias LLM-based Recommendation

Paper • 2412.09243 • Published Dec 12, 2024

Quantile Advantage Estimation for Entropy-Safe Reasoning

Paper • 2509.22611 • Published Sep 26, 2025 • 119

upvoted a paper 6 months ago

Quantile Advantage Estimation for Entropy-Safe Reasoning

Paper • 2509.22611 • Published Sep 26, 2025 • 119

upvoted a paper 10 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 190

upvoted an article 11 months ago

Visualize and understand GPU memory in PyTorch

Article • Published Dec 24, 2024 • 269

upvoted a paper about 1 year ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113

liked a dataset about 1 year ago

open-r1/OpenR1-Math-220k

Viewer • Updated Feb 18, 2025 • 450k • 12.8k • 716

liked a Space about 1 year ago

Scaling test-time compute

📈

594

Run advanced search strategies to boost LLM problem solving

upvoted a paper over 1 year ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 377

liked 4 datasets over 1 year ago

codeparrot/apps

Updated Oct 20, 2022 • 17.3k • 201

google-research-datasets/mbpp

Viewer • Updated Jan 4, 2024 • 1.4k • 256k • 223

openai/openai_humaneval

Viewer • Updated Jan 4, 2024 • 164 • 232k • 376

deepmind/code_contests

Viewer • Updated Jun 11, 2023 • 4.04k • 172k • 220