

Title: Multi-level Scene Modeling and Matching for Smartphone-Based Indoor Localization
Accurate indoor positioning has attracted considerable attention for a variety of indoor location-based applications, driven by the rapid development of mobile devices and their onboard sensors. A hybrid indoor localization method is proposed that uses a single off-the-shelf smartphone and takes advantage of its various onboard sensors, including the camera, gyroscope, and accelerometer. The proposed approach integrates three components: visual-inertial odometry (VIO), point-based area mapping, and plane-based area mapping. A simplified RANSAC strategy is employed in plane matching to reduce processing time. Since Apple's augmented reality platform ARKit provides powerful high-level APIs for world tracking, plane detection, and 3D modeling, a practical indoor localization app is developed for iPhones that can run ARKit. Experimental results demonstrate that the plane-based method achieves an accuracy of about 0.3 m; although its model is much more lightweight, it produces more accurate results than the point-based model that directly uses ARKit's area mapping. The plane-based model is smaller than 2 KB for a closed-loop corridor area of about 45 m × 15 m, compared to about 10 MB for the point-based model.
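A minimal sketch of the kind of RANSAC-style hypothesize-and-verify loop the abstract attributes to plane matching, assuming planes are stored as a unit normal n and an offset d (n·x + d = 0); the single-scalar offset correction, thresholds, and iteration count are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch (not the paper's exact algorithm): RANSAC-style matching of
# freshly detected planes against a stored plane map.
import numpy as np

def residual(p, q):
    """Angle between normals (rad) and offset difference (m) of two planes."""
    ang = np.arccos(np.clip(float(np.dot(p[0], q[0])), -1.0, 1.0))
    return ang, abs(p[1] - q[1])

def match_planes(detected, stored, iters=100, ang_thr=np.radians(10.0),
                 off_thr=0.3, rng=np.random.default_rng(0)):
    """Hypothesize one detected->stored correspondence per iteration, apply the
    implied offset correction to every detected plane, and keep the hypothesis
    that yields the largest set of consistent (inlier) matches."""
    best = []
    for _ in range(iters):
        i, j = rng.integers(len(detected)), rng.integers(len(stored))
        ang, _ = residual(detected[i], stored[j])
        if ang > ang_thr:                      # normals must roughly agree
            continue
        shift = stored[j][1] - detected[i][1]  # crude 1-D drift correction
        inliers = []
        for a, (n, d) in enumerate(detected):
            scores = [residual((n, d + shift), s) for s in stored]
            b = min(range(len(stored)),
                    key=lambda k: scores[k][0] / ang_thr + scores[k][1] / off_thr)
            if scores[b][0] < ang_thr and scores[b][1] < off_thr:
                inliers.append((a, b))
        if len(inliers) > len(best):
            best = inliers
    return best  # list of (detected_index, stored_index) pairs

# Toy usage: two walls and a floor, detected with a uniform 0.25 m offset drift.
stored = [(np.array([1.0, 0, 0]), -2.0), (np.array([0, 0, 1.0]), 0.0),
          (np.array([0, 1.0, 0]), -5.0)]
detected = [(n, d + 0.25) for n, d in stored]
print(match_planes(detected, stored))          # -> [(0, 0), (1, 1), (2, 2)]
```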
Award ID(s):
1827505 1737533
PAR ID:
10185639
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Proceedings - International Symposium on Mixed and Augmented Reality, ISMAR
Page Range / eLocation ID:
311 to 316
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The Georgia Tech Miniature Autonomous Blimp (GT-MAB) needs localization algorithms to navigate to waypoints in an indoor environment without leveraging an external motion capture system. Indoor aerial robots often require a motion capture system for localization or employ simultaneous localization and mapping (SLAM) algorithms for navigation. The proposed localization strategy for the GT-MAB can be accomplished using lightweight sensors suitable for a weight-constrained platform like the GT-MAB. We train an end-to-end convolutional neural network (CNN) that predicts the horizontal position and heading of the GT-MAB from video collected by an onboard monocular RGB camera, while the height of the GT-MAB is estimated from measurements of a time-of-flight (ToF) single-beam laser sensor. The monocular camera and the single-beam laser sensor are sufficient for the localization algorithm to localize the GT-MAB in real time, achieving average 3D positioning errors of less than 20 cm and average heading errors of less than 3 degrees. With this localization accuracy, we are able to use simple proportional-integral-derivative controllers to control the GT-MAB for waypoint navigation. Experimental results on waypoint following are provided, demonstrating the use of a CNN as the primary localization method for estimating the pose of an indoor robot and successfully enabling navigation to specified waypoints.
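    As a companion illustration only, below is a minimal PyTorch sketch of a CNN that regresses horizontal position and heading from a single RGB frame; the architecture, input resolution, and sin/cos heading encoding are assumptions and not the authors' actual network.

```python
# Hedged sketch: an illustrative pose-regression CNN, not the GT-MAB network.
import torch
import torch.nn as nn

class PoseCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(                    # assumes 3x120x160 input
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Outputs: x, y (metres) plus heading encoded as (sin, cos) to avoid
        # the wrap-around discontinuity at 0/360 degrees.
        self.head = nn.Linear(64, 4)

    def forward(self, img):
        out = self.head(self.features(img))
        return out[:, :2], torch.tanh(out[:, 2:])         # (x, y), (sin, cos)

model = PoseCNN()
frame = torch.rand(1, 3, 120, 160)                        # placeholder camera frame
xy, sincos = model(frame)
heading = torch.atan2(sincos[:, 0], sincos[:, 1])         # heading in radians
# The blimp's height would come directly from the ToF range reading, not the CNN.
```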
  2. Agaian, Sos S.; DelMarco, Stephen P.; Asari, Vijayan K. (Eds.)
    High-accuracy localization and user position tracking are critical to improving the quality of augmented reality environments. The biggest challenge facing developers is localizing the user based on visible surroundings. Current solutions rely on the Global Positioning System (GPS) for tracking and orientation. However, GPS receivers have an accuracy of about 10 to 30 meters, which is not accurate enough for augmented reality, where precision is measured in millimeters or smaller. This paper describes the development and demonstration of a head-worn augmented reality (AR) based vision-aided indoor navigation system that localizes the user without relying on a GPS signal. A commercially available augmented reality headset allows individuals to capture their field of vision in real time using the front-facing camera. Utilizing captured image features as navigation-related landmarks allows localizing the user in the absence of a GPS signal. The proposed method involves three steps: detailed front-scene camera data are collected and processed for landmark recognition; the individual's current position is detected and located using feature matching; and arrows are displayed to indicate areas that require additional data collection if needed. Computer simulations indicate that the proposed augmented reality-based vision-aided indoor navigation system can provide precise simultaneous localization and mapping in a GPS-denied environment. Keywords: augmented reality, navigation, GPS, HoloLens, vision, positioning system, localization
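    To make the feature-matching step concrete, here is a minimal OpenCV sketch that localizes a query frame against a database of pre-collected landmark images; the ORB features, database layout, and distance threshold are illustrative assumptions rather than the system described above.

```python
# Hedged sketch: landmark recognition by ORB feature matching (OpenCV).
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def describe(img_path):
    """Return ORB descriptors for one landmark or query image."""
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    return orb.detectAndCompute(img, None)[1]

def localize(query_path, landmark_db):
    """landmark_db: {landmark_id: descriptors}.  Returns the landmark whose
    stored view shares the most well-matched features with the query."""
    q = describe(query_path)
    best_id, best_score = None, 0
    for lid, d in landmark_db.items():
        if q is None or d is None:
            continue
        matches = matcher.match(q, d)
        score = sum(1 for m in matches if m.distance < 40)   # assumed threshold
        if score > best_score:
            best_id, best_score = lid, score
    return best_id, best_score

# Usage (file names are hypothetical):
# db = {"corridor_A": describe("corridor_A.jpg"), "lobby": describe("lobby.jpg")}
# print(localize("current_frame.jpg", db))
```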
  3. This paper proposes an AR-based real-time mobile system for assistive indoor navigation with target segmentation (ARMSAINTS) for both sighted and blind or low-vision (BLV) users to safely explore and navigate an indoor environment. The solution comprises four major components: graph construction, hybrid modeling, real-time navigation, and target segmentation. The system utilizes an automatic graph construction method to generate a graph from a 2D floorplan and a Delaunay triangulation-based localization method to provide precise localization with negligible error. The 3D obstacle detection method integrates the existing capability of AR with a 2D object detector and a semantic target segmentation model to detect and track 3D bounding boxes of obstacles and people, increasing BLV users' safety and understanding when traveling in an indoor environment. The entire system requires no installation or maintenance of expensive infrastructure, runs in real time on a smartphone, and can easily adapt to environmental changes.
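    As an illustration of the graph-construction idea, here is a minimal SciPy sketch that triangulates floorplan waypoints with a Delaunay triangulation, turns the edges into a navigation graph, and snaps a rough position estimate to the containing triangle; the waypoint coordinates and the snapping rule are assumptions, not the ARMSAINTS implementation.

```python
# Hedged sketch: Delaunay-based graph construction from 2D floorplan waypoints.
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical walkable waypoints extracted from a 2D floorplan (metres).
waypoints = np.array([[0, 0], [4, 0], [8, 0], [8, 3], [4, 3], [0, 3], [4, 6]])
tri = Delaunay(waypoints)

# Navigation graph: every triangulation edge becomes a traversable link.
edges = set()
for simplex in tri.simplices:
    for a in range(3):
        i, j = sorted((int(simplex[a]), int(simplex[(a + 1) % 3])))
        edges.add((i, j))

def locate(xy):
    """Index of the triangle containing the position estimate, or -1 if the
    estimate falls outside the triangulated walkable area."""
    return int(tri.find_simplex(np.asarray([xy]))[0])

print(sorted(edges))        # traversable links between waypoint indices
print(locate((5.0, 1.5)))   # triangle id for a point inside the corridor
```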
  4. Localization in urban environments is becoming increasingly important and is used in tools such as ARCore [18], ARKit [34] and others. One popular mechanism to achieve accurate indoor localization and a map of the space is Visual Simultaneous Localization and Mapping (Visual-SLAM). However, Visual-SLAM is known to be resource-intensive in memory and processing time. Furthermore, some of its operations grow in complexity over time, making it challenging to run continuously on mobile devices. Edge computing provides additional compute and memory resources to mobile devices, allowing them to offload tasks without the large latencies seen when offloading to the cloud. In this article, we present Edge-SLAM, a system that uses edge computing resources to offload parts of Visual-SLAM. We use ORB-SLAM2 [50] as a prototypical Visual-SLAM system and modify it into a split architecture between the edge and the mobile device. We keep the tracking computation on the mobile device and move the rest of the computation, i.e., local mapping and loop closing, to the edge. We describe the design choices in this effort and implement them in our prototype. Our results show that our split architecture allows the Visual-SLAM system to function long-term with limited resources without affecting the accuracy of operation. It also keeps the computation and memory cost on the mobile device constant, which would allow for the deployment of other end applications that use Visual-SLAM. We perform a detailed analysis of performance and resource use (CPU, memory, network, and power) to fully understand the effect of our proposed split architecture.
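    A minimal sketch of the split idea in Python follows: tracking stays on the device and keyframes are shipped to an edge process that stands in for local mapping and loop closing. The transport (length-prefixed TCP messages with pickle) and the message fields are assumptions for illustration only; Edge-SLAM itself modifies ORB-SLAM2 in C++.

```python
# Hedged sketch: offloading keyframes from a mobile tracker to an edge server.
import pickle
import socket
import struct

def send_msg(sock, obj):
    data = pickle.dumps(obj)
    sock.sendall(struct.pack("!I", len(data)) + data)      # length-prefixed frame

def recv_msg(sock):
    (length,) = struct.unpack("!I", sock.recv(4))
    buf = b""
    while len(buf) < length:
        buf += sock.recv(length - len(buf))
    return pickle.loads(buf)

def mobile_tracking_loop(edge_host="edge.local", port=6000):
    """Device side: track every frame locally, offload only keyframes."""
    with socket.create_connection((edge_host, port)) as sock:
        for frame_id in range(5):                          # placeholder keyframes
            keyframe = {"id": frame_id, "pose": [0.0] * 6, "features": []}
            send_msg(sock, keyframe)                       # offload to the edge
            update = recv_msg(sock)                        # refined map / poses back
            print("map update with", update["num_map_points"], "points")

def edge_server(port=6000):
    """Edge side: receive keyframes, run the heavy mapping work, reply."""
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            while True:
                try:
                    kf = recv_msg(conn)
                except struct.error:                       # client disconnected
                    break
                # Placeholder for local mapping + loop closing on the keyframe.
                send_msg(conn, {"num_map_points": 100 + kf["id"]})

# Run edge_server() in one process and mobile_tracking_loop("127.0.0.1") in another.
```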
  5. Cost-effective sensors can capture, in real time, a variety of air-quality-related modalities, from pollutant concentrations to indoor/outdoor humidity and temperature. Machine learning (ML) models are capable of performing "ahead-of-time" air-quality approximations. Undoubtedly, accurate indoor air quality approximation significantly helps provide a healthy indoor environment, optimize associated energy consumption, and offer human comfort. However, it is crucial to design an ML architecture that captures the domain knowledge, the so-called problem physics. In this study, we propose six novel physics-based ML models for accurate indoor pollutant concentration approximation. The proposed models combine state-space concepts from physics, gated recurrent units (GRUs), and decomposition techniques. The models are demonstrated using data collected from five offices in a commercial building in California and are shown to be less complex, computationally more efficient, and more accurate than comparable state-of-the-art transformer-based models. Their superiority is due to their relatively light architecture (computational efficiency) and, more importantly, their ability to capture the highly nonlinear patterns embedded in the often-contaminated sensor-collected indoor air quality temporal data.
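    A minimal PyTorch sketch of a state-space-flavoured GRU forecaster in the spirit described above: the GRU hidden state stands in for the physical system state and a linear layer acts as the observation model. The layer sizes, input features, and omission of the decomposition step are assumptions, not any of the paper's six architectures.

```python
# Hedged sketch: a state-space-style GRU for indoor pollutant forecasting.
import torch
import torch.nn as nn

class StateSpaceGRU(nn.Module):
    def __init__(self, n_inputs=4, n_state=32):
        super().__init__()
        # Assumed exogenous inputs per time step: outdoor concentration,
        # indoor humidity, indoor temperature, an occupancy/ventilation proxy.
        self.transition = nn.GRU(n_inputs, n_state, batch_first=True)   # state evolution
        self.observation = nn.Linear(n_state, 1)                        # state -> measurement

    def forward(self, exogenous):
        state_seq, _ = self.transition(exogenous)
        return self.observation(state_seq)       # predicted indoor concentration

model = StateSpaceGRU()
history = torch.rand(8, 24, 4)   # batch of 8 rooms, 24 hourly steps, 4 features
ahead = model(history)           # "ahead-of-time" predictions, one per step
print(ahead.shape)               # torch.Size([8, 24, 1])
```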