Song Jiang

songjiang

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning

Paper • 2602.21534 • Published 3 days ago • 20

upvoted 2 papers 5 months ago

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

Paper • 2510.09541 • Published Oct 10, 2025 • 17

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published Oct 1, 2025 • 59

authored a paper 11 months ago

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Paper • 2503.15478 • Published Mar 19, 2025 • 13

liked a model about 2 years ago

liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5

Text Generation • Updated Oct 5, 2023 • 47 • 23

authored a paper over 2 years ago

LLM-Rec: Personalized Recommendation via Prompting Large Language Models

Paper • 2307.15780 • Published Jul 24, 2023 • 28

liked 3 models over 2 years ago

meta-llama/Llama-2-70b-hf

Text Generation • Updated Apr 17, 2024 • 13.4k • 856

allenai/tulu-7b

Text Generation • Updated Jun 20, 2023 • 14 • 9

allenai/tulu-65b

Text Generation • Updated Jun 29, 2023 • 13 • 21