-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 189 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
Collections
Discover the best community collections!
Collections including paper arxiv:2512.05145
-
MemLoRA: Distilling Expert Adapters for On-Device Memory Systems
Paper • 2512.04763 • Published • 3 -
VisPlay: Self-Evolving Vision-Language Models from Images
Paper • 2511.15661 • Published • 42 -
VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse
Paper • 2512.14531 • Published • 12 -
Improving Recursive Transformers with Mixture of LoRAs
Paper • 2512.12880 • Published • 5
-
Depth Anything V2
Paper • 2406.09414 • Published • 103 -
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 51 -
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 39 -
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 120
-
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper • 2512.02472 • Published • 51 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 141 -
Video Reasoning without Training
Paper • 2510.17045 • Published • 7 -
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 270
-
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
Paper • 2510.08540 • Published • 109 -
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165 -
Spotlight on Token Perception for Multimodal Reinforcement Learning
Paper • 2510.09285 • Published • 36 -
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
Paper • 2510.17354 • Published • 33
-
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Paper • 2410.17637 • Published • 35 -
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Paper • 2411.10442 • Published • 87 -
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Paper • 2411.18203 • Published • 40 -
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 25
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 189 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Guided Self-Evolving LLMs with Minimal Human Supervision
Paper • 2512.02472 • Published • 51 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 141 -
Video Reasoning without Training
Paper • 2510.17045 • Published • 7 -
Agent Learning via Early Experience
Paper • 2510.08558 • Published • 270
-
MemLoRA: Distilling Expert Adapters for On-Device Memory Systems
Paper • 2512.04763 • Published • 3 -
VisPlay: Self-Evolving Vision-Language Models from Images
Paper • 2511.15661 • Published • 42 -
VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse
Paper • 2512.14531 • Published • 12 -
Improving Recursive Transformers with Mixture of LoRAs
Paper • 2512.12880 • Published • 5
-
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
Paper • 2510.08540 • Published • 109 -
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165 -
Spotlight on Token Perception for Multimodal Reinforcement Learning
Paper • 2510.09285 • Published • 36 -
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
Paper • 2510.17354 • Published • 33
-
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Paper • 2410.17637 • Published • 35 -
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Paper • 2411.10442 • Published • 87 -
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Paper • 2411.18203 • Published • 40 -
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Paper • 2411.14432 • Published • 25
-
Depth Anything V2
Paper • 2406.09414 • Published • 103 -
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Paper • 2406.09415 • Published • 51 -
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
Paper • 2406.04338 • Published • 39 -
SAM 2: Segment Anything in Images and Videos
Paper • 2408.00714 • Published • 120