Mashiro

AlexMashiro

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 2 days ago

Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards

Paper • 2601.06021 • Published 15 days ago • 43

upvoted a paper 7 days ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5, 2025 • 80

upvoted a paper 12 days ago

Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling

Paper • 2510.17314 • Published Oct 20, 2025 • 2

upvoted a paper 19 days ago

Training AI Co-Scientists Using Rubric Rewards

Paper • 2512.23707 • Published 26 days ago • 21

upvoted a paper 24 days ago

InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training

Paper • 2510.15859 • Published Oct 17, 2025 • 13

upvoted a paper 27 days ago

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

Paper • 2508.16949 • Published Aug 23, 2025 • 24

upvoted 2 papers about 2 months ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published Nov 24, 2025 • 61

Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning

Paper • 2509.25534 • Published Sep 19, 2025 • 3

upvoted a paper 2 months ago

Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training

Paper • 2509.21500 • Published Sep 25, 2025 • 20