LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis
Abstract
Recent work has shown that neural networks can perform 3D tasks such as Novel View Synthesis (NVS) without explicit 3D reconstruction. Even so, we argue that strong 3D inductive biases remain helpful in the design of such networks. We demonstrate this by introducing LagerNVS, an encoder-decoder neural network for NVS that builds on "3D-aware" latent features. The encoder is initialized from a 3D reconstruction network pre-trained with explicit 3D supervision; it is paired with a lightweight decoder and trained end-to-end with photometric losses. LagerNVS achieves state-of-the-art deterministic feed-forward Novel View Synthesis (including 31.4 PSNR on Re10k) with and without known cameras, renders in real time, generalizes to in-the-wild data, and can be paired with a diffusion decoder for generative extrapolation.
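The abstract describes a pipeline in which a pre-trained 3D-aware encoder produces latent features that a lightweight decoder turns into a target view, supervised by photometric losses. As a rough illustration of that data flow only (not the paper's actual architecture), here is a minimal NumPy sketch with randomly initialized stand-in weights; all function names, shapes, and the pose encoding are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def pretrained_3d_encoder(images, dim=64):
    # Stand-in for the 3D-aware encoder (in the paper, initialized from a
    # 3D reconstruction network pre-trained with explicit 3D supervision).
    # Here it simply flattens each source view and projects it to a latent.
    n, h, w, c = images.shape
    w_proj = rng.standard_normal((h * w * c, dim)) * 0.01
    return images.reshape(n, -1) @ w_proj  # (n_views, dim) latent features

def lightweight_decoder(latents, target_pose, h=8, w=8, c=3):
    # Stand-in for the lightweight decoder: pools the latent features,
    # conditions them on a target camera pose, and regresses an RGB image.
    cond = np.concatenate([latents.mean(axis=0), target_pose])
    w_out = rng.standard_normal((cond.size, h * w * c)) * 0.01
    img = np.tanh(cond @ w_out)  # squash pixel values to [-1, 1]
    return img.reshape(h, w, c)

def photometric_loss(pred, target):
    # Simple L2 photometric loss between rendered and ground-truth views.
    return float(np.mean((pred - target) ** 2))

# Toy data: two 8x8 RGB source views, one target view, and its pose
# (here a flattened 3x4 extrinsic; the real pose encoding is unspecified).
sources = rng.random((2, 8, 8, 3))
target = rng.random((8, 8, 3))
target_pose = rng.random(12)

latents = pretrained_3d_encoder(sources)
rendered = lightweight_decoder(latents, target_pose)
loss = photometric_loss(rendered, target)
print(rendered.shape, loss >= 0.0)
```

In the paper the encoder and decoder are trained end-to-end on this photometric objective; the sketch above only shows a single forward pass with random weights.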
Community
LagerNVS (CVPR 2026, https://arxiv.org/abs/2603.20176) is a generalizable, feed-forward Novel View Synthesis network which
- performs rendering in real time,
- generalizes to in-the-wild data,
- works with and without known source cameras,
- sets a new state-of-the-art among deterministic methods,
- can be paired with a diffusion decoder for generative extrapolation.
General model: https://huggingface.co/facebook/lagernvs_general_512 (links to the other models are on GitHub)
Get this paper in your agent:
hf papers read 2603.20176
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash
Models citing this paper: 3