
YUANZHE HU

ai-hyz

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 17 days ago

When Reasoning Meets Its Laws

Paper • 2512.17901 • Published 20 days ago • 56

upvoted a paper 3 months ago

When Benchmarks Age: Temporal Misalignment through Large Language Model Factuality Evaluation

Paper • 2510.07238 • Published Oct 8, 2025 • 14

updated a dataset 3 months ago

ai-hyz/MemoryAgentBench

Viewer • Updated Oct 7, 2025 • 146 • 12.9k • 22

upvoted a paper 3 months ago

BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses

Paper • 2510.00232 • Published Sep 30, 2025 • 15

upvoted a collection 3 months ago

Representation & Optimization

Collection

Understanding about representation sheds light on optimization • 114 items • Updated about 17 hours ago • 5

upvoted a paper 3 months ago

Who's Your Judge? On the Detectability of LLM-Generated Judgments

Paper • 2509.25154 • Published Sep 29, 2025 • 29

updated a collection 3 months ago

Agentic Memory

Collection

3 items • Updated Oct 1, 2025

authored a paper 3 months ago

Mem-α: Learning Memory Construction via Reinforcement Learning

Paper • 2509.25911 • Published Sep 30, 2025 • 14

upvoted 2 papers 3 months ago

Mem-α: Learning Memory Construction via Reinforcement Learning

Paper • 2509.25911 • Published Sep 30, 2025 • 14

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26, 2025 • 134

updated a dataset 4 months ago

ai-hyz/MemoryAgentBench_Sep19

Viewer • Updated Sep 20, 2025 • 130 • 8

published a dataset 4 months ago

ai-hyz/MemoryAgentBench_Sep19

Viewer • Updated Sep 20, 2025 • 130 • 8

updated a dataset 4 months ago

ai-hyz/cr-train-list

Viewer • Updated Sep 11, 2025 • 50 • 2

published a dataset 4 months ago

ai-hyz/cr-train-list

Viewer • Updated Sep 11, 2025 • 50 • 2

upvoted a paper 4 months ago

WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning

Paper • 2509.04744 • Published Sep 5, 2025 • 11

liked a model 4 months ago

LLM360/K2-Think

Text Generation • 33B • Updated Nov 19, 2025 • 437 • 364

liked a Space 5 months ago

The Ultra-Scale Playbook

🌌

3.63k

The ultimate guide to training LLM on large GPU Clusters

upvoted a paper 6 months ago

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 64

New activity in ai-hyz/MemoryAgentBench 6 months ago

Enhance dataset card: Add task categories, tags, library_name, and sample usage

#2 opened 6 months ago by nielsr
