Monocularly Generated 3D High Level Semantic Model by Integrating Deep Learning Models and Traditional Vision Techniques

Alsheimer, Steven; Zhu, Zhigang

doi:10.1109/IST50367.2021.9651471

Citation Details

Monocularly Generated 3D High Level Semantic Model by Integrating Deep Learning Models and Traditional Vision Techniques

Scene reconstruction using Monodepth2 (Monocular Depth Inference) which provides depth maps from a single RGB camera, the outputs are filled with noise and inconsistencies. Instance segmentation using a Mask R-CNN (Region Based Convolution Neural Networks) deep model can provide object segmentation results in 2D but lacks 3D information. In this paper we propose to integrate the results of Instance segmentation via Mask R-CNN’s, CAD model Car Shape Alignment, and depth from Monodepth2 together with classical dynamic vision techniques to create a High-level Semantic Model with separability, robustness, consistency and saliency. The model is useful for both virtualized rendering, semantic augmented reality and automatic driving. Experimental results are provided to validate the approach. more »

Award ID(s):: 1827505 1737533

PAR ID:: 10346691

Author(s) / Creator(s):: Alsheimer, Steven; Zhu, Zhigang

Date Published:: 2021-08-24

Journal Name:: 2021 IEEE International Conference on Imaging Systems and Techniques (IST)

Page Range / eLocation ID:: 1 to 6

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1109/IST50367.2021.9651471

More Like this