FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 116
Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling Paper • 2601.22636 • Published 23 days ago • 21
SEMA: Simple yet Effective Learning for Multi-Turn Jailbreak Attacks Paper • 2602.06854 • Published 15 days ago • 6
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models Paper • 2602.13191 • Published 8 days ago • 29
Improving Data and Reward Design for Scientific Reasoning in Large Language Models Paper • 2602.08321 • Published 13 days ago • 40
microsoft/Phi-4-multimodal-instruct-onnx Automatic Speech Recognition • Updated 11 days ago • 140 • 88