Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani PRO
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
updated a model about 20 hours ago
mehuldamani/sft-qwen-vmaze-v1
published a model about 20 hours ago
mehuldamani/sft-qwen-vmaze-v1
published a model 1 day ago
mehuldamani/rlvr_multi_k3
Organizations
None yet