Zero-Shot Monocular Scene Flow Estimation in the Wild

Liang, Yiqing; Badki, Abhishek; Su, Hang; Tompkin, James; Gallo, Orazio

This content will become publicly available on June 11, 2026

Large models have shown generalization across datasets for many low-level vision tasks, like depth estimation, but no such general models exist for scene flow. Even though scene flow prediction has wide potential, its practical use is limited because of the lack of generalization of current predictive models. We identify three key challenges and propose solutions for each. First, we create a method that jointly estimates geometry and motion for accurate prediction. Second, we alleviate scene flow data scarcity with a data recipe that affords us 1M annotated training samples across diverse synthetic scenes. Third, we evaluate different parameterizations for scene flow prediction and adopt a natural and effective parameterization. Our model outperforms existing methods as well as baselines built on large-scale models in terms of 3D end-point error, and shows zero-shot generalization to the casually captured videos from DAVIS and the robotic manipulation scenes from RoboTAP. Overall, our approach makes scene flow prediction more practical in the wild. Website: https://research.nvidia.com/labs/lpr/zero_msf/
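For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of 3D end-point error under its standard definition: the mean Euclidean distance between predicted and ground-truth 3D flow vectors. The function name `end_point_error_3d` and the sample values are illustrative assumptions; this record does not describe the paper's exact evaluation protocol (for example, any masking or scale alignment), so this should not be read as the authors' implementation.

```python
import math


def end_point_error_3d(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth 3D flow vectors.

    Both arguments are sequences of (dx, dy, dz) tuples, one per point.
    Hypothetical helper for illustration only.
    """
    if len(pred_flow) != len(gt_flow) or not pred_flow:
        raise ValueError("flows must be non-empty and of equal length")
    total = 0.0
    for p, g in zip(pred_flow, gt_flow):
        total += math.dist(p, g)  # per-point L2 error
    return total / len(pred_flow)


# Made-up values: the first point is predicted exactly, the second is off by 3 units.
pred = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
epe = end_point_error_3d(pred, gt)
print(epe)
```

Lower values indicate predicted motion closer to the ground truth.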

Award ID(s): 2144956

PAR ID: 10580897

Author(s) / Creator(s): Liang, Yiqing; Badki, Abhishek; Su, Hang; Tompkin, James; Gallo, Orazio

Publisher / Repository: IEEE/CVF Computer Vision and Pattern Recognition

Date Published: 2025-06-11

Format(s): Medium: X

Location: Nashville, TN

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
The DOI is not currently available.
