jkrs

AI & ML interests

Reinforcement Learning

Recent Activity

upvoted a paper 2 days ago

VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction

upvoted a paper 5 months ago

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers

liked a dataset over 1 year ago

Anthropic/hh-rlhf

Organizations

None yet

models 1

jkrs/output

Updated Oct 20, 2022

datasets 0

None public yet