floyed shen
floyed
AI & ML interests
None yet
Recent Activity
upvoted a collection 1 day ago
Qwen3.5
commented on a paper 3 days ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
submitted a paper 4 days ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training