aopolin-lv

AI & ML interests

None yet

Recent Activity

liked a dataset 5 days ago

meituan-longcat/LARYBench

upvoted a paper 4 days ago

LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment

Paper • 2604.11689 • Published 6 days ago • 11

upvoted a paper 11 days ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published 13 days ago • 233

upvoted a paper 17 days ago

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

Paper • 2603.29620 • Published 19 days ago • 46

upvoted a paper 23 days ago

MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data

Paper • 2603.25319 • Published 24 days ago • 32

upvoted a paper about 1 month ago

Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models

Paper • 2603.15557 • Published Mar 16 • 29

upvoted a paper 3 months ago

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation

Paper • 2601.05241 • Published Jan 8 • 24

upvoted 2 papers 4 months ago

Act2Goal: From World Model To General Goal-conditioned Policy

Paper • 2512.23541 • Published Dec 29, 2025 • 23

LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry

Paper • 2512.19629 • Published Dec 22, 2025 • 26

upvoted a paper 5 months ago

VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference

Paper • 2512.01031 • Published Nov 30, 2025 • 26

upvoted a paper 7 months ago

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8, 2025 • 33

upvoted a paper 8 months ago

EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper • 2508.21112 • Published Aug 28, 2025 • 78

upvoted a paper 9 months ago

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

Paper • 2507.05240 • Published Jul 7, 2025 • 48