Paper: VLS: Steering Pretrained Robot Policies via Vision-Language Models (2602.03973)
A Diffusion Policy checkpoint trained on the CALVIN benchmark, released as the frozen base policy used in VLS: Steering Pretrained Robot Policies via Vision-Language Models.
VLS is a training-free, inference-time framework that steers the sampling process of a frozen generative robot policy (such as this checkpoint) using trajectory-differentiable rewards synthesized by a vision-language model. It requires no fine-tuning or weight updates.
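To make the idea of steering a frozen sampler concrete, here is a minimal toy sketch of generic reward-gradient guidance. It is not the VLS implementation and does not load this checkpoint: `frozen_denoise_step` is a hypothetical stand-in for one denoising step of a frozen policy, and `reward` is a hand-written differentiable function standing in for a reward that a vision-language model might synthesize. All names, shapes, and constants are assumptions chosen for illustration.

```python
import numpy as np

HORIZON, ACTION_DIM = 8, 2  # assumed toy trajectory shape


def frozen_denoise_step(traj, step_size=0.2):
    """Hypothetical stand-in for one denoising step of a frozen policy.

    It shrinks the sample toward the all-zero trajectory and has no
    trainable state, so nothing is ever updated.
    """
    return traj - step_size * traj


def reward(traj, goal):
    """Toy differentiable reward: negative squared distance of the last waypoint to the goal."""
    return -np.sum((traj[-1] - goal) ** 2)


def reward_grad(traj, goal):
    """Analytic gradient of `reward` with respect to the whole trajectory."""
    grad = np.zeros_like(traj)
    grad[-1] = -2.0 * (traj[-1] - goal)
    return grad


def sample(goal, guidance_scale, num_steps=50, seed=0):
    """Run the frozen sampler, nudging each iterate along the reward gradient."""
    rng = np.random.default_rng(seed)
    traj = rng.standard_normal((HORIZON, ACTION_DIM))  # start from noise
    for _ in range(num_steps):
        traj = frozen_denoise_step(traj)  # frozen policy step
        traj = traj + guidance_scale * reward_grad(traj, goal)  # steering step
    return traj


goal = np.array([1.0, -1.0])
unsteered = sample(goal, guidance_scale=0.0)
steered = sample(goal, guidance_scale=0.1)
print("reward without steering:", reward(unsteered, goal))
print("reward with steering:   ", reward(steered, goal))
```

In this sketch the only thing that changes between the two runs is the guidance scale; the stand-in policy is identical in both, which is the sense in which steering happens purely at inference time.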