Title: Indoor camera pose estimation via style‐transfer 3D models
Abstract

Many vision‐based indoor localization methods require tedious and comprehensive pre‐mapping of built environments. This research proposes a mapping‐free approach to estimating indoor camera poses based on a 3D style‐transferred building information model (BIM) and photogrammetry technique. To address the cross‐domain gap between virtual 3D models and real‐life photographs, a CycleGAN model was developed to transform BIM renderings into photorealistic images. A photogrammetry‐based algorithm was developed to estimate camera pose using the visual and spatial information extracted from the style‐transferred BIM. The experiments demonstrated the efficacy of CycleGAN in bridging the cross‐domain gap, which significantly improved performance in terms of image retrieval and feature correspondence detection. With the 3D coordinates retrieved from BIM, the proposed method can achieve near real‐time camera pose estimation with an accuracy of 1.38 m and 10.1° in indoor environments.
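The reported accuracy pairs a translation error in meters with a rotation error in degrees. As a minimal sketch of how such a pair is commonly computed, and not necessarily the evaluation procedure used in this work, the example below measures the Euclidean distance between estimated and ground-truth camera positions and the geodesic angle between the two rotation matrices. The function names and sample poses are hypothetical and chosen only for illustration.

```python
import math


def rotation_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.dist(t_est, t_gt)


def rotation_error_deg(R_est, R_gt):
    """Angle, in degrees, of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_est^T R_gt) equals the element-wise (Frobenius) inner product.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


# Hypothetical poses: positions in meters, headings as rotations about z.
t_err = translation_error((1.0, 2.0, 0.0), (2.0, 3.0, 0.0))
r_err = rotation_error_deg(rotation_z(30.0), rotation_z(20.0))
print(f"translation error: {t_err:.2f} m, rotation error: {r_err:.1f} deg")
```

Reporting the two errors separately, rather than as one combined score, keeps position and orientation accuracy independently interpretable.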

 
Award ID(s):
1850008
PAR ID:
10253198
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Computer-Aided Civil and Infrastructure Engineering
Volume:
37
Issue:
3
ISSN:
1093-9687
Page Range / eLocation ID:
p. 335-353
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, etc. Recent advances estimate the camera pose via optimization over the 2D/3D-3D correspondences established between the coordinates in 2D/3D camera space and 3D world space. Such a mapping is estimated with either a convolutional neural network or a decision tree using only the static input image sequence, which makes these approaches vulnerable to dynamic indoor environments that are quite common yet challenging in the real world. To address these issues, we propose a novel outlier-aware neural tree that bridges the two worlds of deep learning and decision tree approaches. It builds on three important blocks: (a) a hierarchical space partition over the indoor scene to construct the decision tree; (b) a neural routing function, implemented as a deep classification network, employed for better 3D scene understanding; and (c) an outlier rejection module used to filter out dynamic points during the hierarchical routing process. Our proposed algorithm is evaluated on the RIO-10 benchmark developed for camera relocalization in dynamic indoor environments. It achieves robust neural routing through space partitions and outperforms the state-of-the-art approaches by around 30% on camera pose accuracy, while running comparably fast for evaluation.
  2. Building Information Modelling (BIM) is an integrated informational process and plays a key role in enabling efficient planning and control of a project in the Architecture, Engineering, and Construction (AEC) domain. Industry Foundation Classes (IFC)-based BIM allows building information to be interoperable among different BIM applications. Different stakeholders take different responsibilities in a project and therefore keep different types of information to meet project requirements. In this paper, the authors proposed and adopted a six-step methodology to support BIM interoperability between architectural design and structural analysis at both the AEC project level and the information level, in which: (1) the intrinsic and extrinsic information transferred between architectural models and structural models was analyzed and demonstrated by a Business Process Model and Notation (BPMN) model that the authors developed; (2) the proposed technical routes with different combinations, and their applications to different project delivery methods, provided new instruments to industry stakeholders for efficient and accurate decision-making; (3) the portable, material-centered invariant signature can improve information exchange between different data formats and models to support interoperable BIM applications; and (4) a formal material information representation and checking method was developed and tested on a case study, where it was shown to be more efficient than (a) proprietary representations with a manual information checking method and (b) an MVD-based information checking method. The proposed invariant signature-based material information representation and checking method makes information transfer between architectural design and structural analysis more efficient, which can have a significant positive effect on project delivery given the frequent and iterative updates of a project design. This improves the information transfer and coordination between architects and structural engineers and therefore the efficiency of the whole project. The proposed method can be extended and applied to other application phases and functions such as cost estimation, scheduling, and energy analysis.
  3. de la Garza, J.M. (Ed.)
    There has been an increasing demand for building information modeling (BIM) in structural analysis. However, model exchange between architectural software and structural analysis software, which is an essential task in a construction project, is not fully interoperable yet. Various studies showed missing information and information inconsistency problems during conversion of models between different software; the lack of foundational methods enabling seamless BIM interoperability between architectural design and structural analysis is evident. To address this gap and facilitate more use of BIM for structural analysis, the authors develop invariant signatures for architecture, engineering, and construction (AEC) objects and propose a new data-driven method that uses invariant signatures to solve practical problems in BIM applications. The invariant signatures and the data-driven method were tested experimentally by developing an interoperable BIM support tool for structural analysis. Ten models were created/adopted and used in this experiment, including five models for training and five models for testing. An information validation and mapping algorithm was developed based on invariant signatures and the training models, and was then evaluated on the testing models. Compared with a manually created gold standard, results showed that the desired structural analysis software inputs were successfully generated using the algorithm with high accuracy. The invariant signatures of AEC objects can therefore be expected to serve as the foundation of seamless BIM interoperability.
  4. Virtual reality is increasingly used to support embodied AI agents, such as robots, which frequently engage in ‘sim-to-real’ based learning approaches. At the same time, tools such as large vision-and-language models offer new capabilities that tie into a wide variety of tasks. In order to understand how such agents can learn from simulated environments, we explore a language model’s ability to recover the type of object represented by a photorealistic 3D model as a function of the 3D perspective from which the model is viewed. We used photogrammetry to create 3D models of commonplace objects and rendered 2D images of these models from a fixed set of 420 virtual camera perspectives. A well-studied image and language model (CLIP) was used to generate text (i.e., prompts) corresponding to these images. Using multiple instances of various object classes, we studied which camera perspectives were most likely to return accurate text categorizations for each class of object.
  5. With the rapid development of mobile devices and their onboard sensors, accurate indoor positioning has attracted a lot of attention for a variety of indoor location-based applications. A hybrid indoor localization method is proposed based on a single off-the-shelf smartphone, which takes advantage of its various onboard sensors, including camera, gyroscope, and accelerometer. The proposed approach integrates three components: visual-inertial odometry (VIO), point-based area mapping, and plane-based area mapping. A simplified RANSAC strategy is employed in plane matching to reduce processing time. Since Apple's augmented reality platform ARKit has many powerful high-level APIs for world tracking, plane detection, and 3D modeling, a practical smartphone app for indoor localization is developed on an iPhone that can run ARKit. Experimental results demonstrate that our plane-based method achieves an accuracy of about 0.3 m; it relies on a much more lightweight model yet is more accurate than the point-based model that directly uses ARKit's area mapping. The size of the plane-based model is less than 2 KB for a closed-loop corridor area of about 45 m × 15 m, compared to about 10 MB for the point-based model.