Zhang Jiahui

zhangjiahuise

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 2 days ago

DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off

Paper • 2604.13902 • Published 11 days ago • 59

liked a model 11 days ago

tencent/HY-Embodied-0.5

Image-Text-to-Text • 4B • Updated 11 days ago • 2.67k • 903

upvoted a paper 11 days ago

Training a Student Expert via Semi-Supervised Foundation Model Distillation

Paper • 2604.03841 • Published 22 days ago • 10

liked a dataset 13 days ago

Congliu/Chinese-DeepSeek-R1-Distill-data-110k

Viewer • Updated Feb 21, 2025 • 110k • 1.18k • 744

upvoted a paper 13 days ago

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Paper • 2604.08523 • Published 17 days ago • 259

liked a model 13 days ago

google/electra-base-discriminator

Updated Feb 29, 2024 • 48.9M • 99

upvoted a paper 15 days ago

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability

Paper • 2604.06628 • Published 18 days ago • 321

liked 2 datasets 17 days ago

jpena-173/synthetic-eyes-v1

Viewer • Updated 17 days ago • 1.2k • 1.05k • 1

allenai/dolma3_mix-6T-1025-7B

Updated Jan 15 • 34.7k • 48

liked a model 21 days ago

Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 201k • 2.47k

upvoted 2 papers 24 days ago

MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models

Paper • 2603.28590 • Published 26 days ago • 22

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

upvoted a paper 25 days ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 342

liked a dataset 25 days ago

hf-doc-build/doc-build

Updated about 2 hours ago • 311k • 34

liked a model 25 days ago

openai/whisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated Oct 4, 2024 • 7.01M • 2.97k

upvoted a paper 25 days ago

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Paper • 2603.13904 • Published Mar 14 • 4

upvoted a paper about 1 month ago

Demystifing Video Reasoning

Paper • 2603.16870 • Published Mar 17 • 370

upvoted 2 papers about 2 months ago

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 264

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 519

liked a model about 2 months ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27, 2025 • 3.96M • 13.3k
