Mitigating Perspective Distortion-induced Shape Ambiguity in Image Crops

Prakash, Aditya; Gupta, Arjun; Gupta, Saurabh


Objects undergo varying amounts of perspective distortion as they move across a camera's field of view. Models for predicting 3D from a single image often work with crops around the object of interest and ignore the location of the object in the camera's field of view. We note that ignoring this location information further exaggerates the inherent ambiguity in making 3D inferences from 2D images and can prevent models from even fitting to the training data. To mitigate this ambiguity, we propose Intrinsics-Aware Positional Encoding (KPE), which incorporates information about the location of crops in the image and camera intrinsics. Experiments on three popular 3D-from-a-single-image benchmarks: depth prediction on NYU, 3D object detection on KITTI & nuScenes, and predicting 3D shapes of articulated objects on ARCTIC, show the benefits of KPE.
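The abstract does not spell out how the encoding is computed, so the following is only an illustrative sketch of one plausible way to combine crop location with camera intrinsics, not the authors' implementation. It assumes a pinhole camera with focal lengths `fx`, `fy` and principal point `cx`, `cy` (all in pixels), converts sample points of a crop into viewing angles relative to the optical axis, and then applies a standard sinusoidal encoding. The function names `crop_ray_angles` and `sinusoidal_encode` are hypothetical.

```python
import math


def crop_ray_angles(crop_box, intrinsics, grid=(2, 2)):
    """Viewing angles (radians) for a grid of points spanning a crop.

    crop_box:   (x0, y0, x1, y1) in full-image pixel coordinates.
    intrinsics: (fx, fy, cx, cy) of an assumed pinhole camera, in pixels.
    grid:       (rows, cols) number of sample points across the crop.
    """
    x0, y0, x1, y1 = crop_box
    fx, fy, cx, cy = intrinsics
    rows, cols = grid
    angles = []
    for r in range(rows):
        # Interpolate from the top edge to the bottom edge of the crop;
        # a single row is placed at the vertical midpoint.
        v = y0 + (y1 - y0) * (r / (rows - 1) if rows > 1 else 0.5)
        row = []
        for c in range(cols):
            u = x0 + (x1 - x0) * (c / (cols - 1) if cols > 1 else 0.5)
            # Angle between the pixel's viewing ray and the optical axis,
            # measured separately along the horizontal and vertical axes.
            theta_x = math.atan2(u - cx, fx)
            theta_y = math.atan2(v - cy, fy)
            row.append((theta_x, theta_y))
        angles.append(row)
    return angles


def sinusoidal_encode(angle, num_freqs=4):
    """Map one angle to [sin(2^k a), cos(2^k a)] for k = 0..num_freqs-1."""
    feats = []
    for k in range(num_freqs):
        feats.append(math.sin((2 ** k) * angle))
        feats.append(math.cos((2 ** k) * angle))
    return feats


if __name__ == "__main__":
    K = (500.0, 500.0, 320.0, 240.0)  # hypothetical intrinsics
    center = crop_ray_angles((270, 190, 370, 290), K)
    edge = crop_ray_angles((540, 190, 640, 290), K)
    # Same crop size, different image location, different angles.
    print(center[0][0], edge[0][0])
```

Under these assumptions, two crops of identical pixel size receive different encodings when they sit at different places in the field of view, which is the location information the abstract says crop-based models otherwise discard.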

Award ID(s): 2143873

PAR ID: 10581758

Author(s) / Creator(s): Prakash, Aditya; Gupta, Arjun; Gupta, Saurabh

Publisher / Repository: Proceedings of European Conference on Computer Vision (ECCV)

Date Published: 2024-10-04

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
