Mobile Augmented Reality (AR) demands realistic rendering of virtual content that seamlessly blends into the physical environment. For this reason, AR headsets and recent smartphones are increasingly equipped with Time-of-Flight (ToF) cameras to acquire depth maps of a scene in real-time. ToF cameras are cheap and fast, however, they suffer from several issues that affect the quality of depth data, ultimately hampering their use for mobile AR. Among them, scale errors of virtual objects - appearing much bigger or smaller than what they should be - are particularly noticeable and unpleasant. This article specifically addresses these challenges by proposing InDepth, a real-time depth inpainting system based on edge computing. InDepth employs a novel deep neural network (DNN) architecture to improve the accuracy of depth maps obtained from ToF cameras. The DNN fills holes and corrects artifacts in the depth maps with high accuracy and eight times lower inference time than the state of the art. An extensive performance evaluation in real settings shows that InDepth reduces the mean absolute error by a factor of four with respect to ARCore DepthLab. Finally, a user study reveals that InDepth is effective in rendering correctly-scaled virtual objects, outperforming DepthLab.
more »
« less
Will it Move?: Indoor Scene Characterization for Hologram Stability in Mobile AR
Mobile Augmented Reality (AR) provides immersive experiences by aligning virtual content (holograms) with a view of the real world. When a user places a hologram it is usually expected that like a real object, it remains in the same place. However, positional errors frequently occur due to inaccurate environment mapping and device localization, to a large extent determined by the properties of natural visual features in the scene. In this demonstration we present SceneIt, the first visual environment rating system for mobile AR based on predictions of hologram positional error magnitude. SceneIt allows users to determine if virtual content placed in their environment will drift noticeably out of position, without requiring them to place that content. It shows that the severity of positional error for a given visual environment is predictable, and that this prediction can be calculated with sufficiently high accuracy and low latency to be useful in mobile AR applications.
more »
« less
- PAR ID:
- 10296638
- Date Published:
- Journal Name:
- Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications
- Page Range / eLocation ID:
- 174 to 176
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
A Global Correction Framework for Camera Registration in Video See-Through Augmented Reality SystemsAbstract Augmented reality (AR) enhances the user’s perception of the real environment by superimposing virtual images generated by computers. These virtual images provide additional visual information that complements the real-world view. AR systems are rapidly gaining popularity in various manufacturing fields such as training, maintenance, assembly, and robot programming. In some AR applications, it is crucial for the invisible virtual environment to be precisely aligned with the physical environment to ensure that human users can accurately perceive the virtual augmentation in conjunction with their real surroundings. The process of achieving this accurate alignment is known as calibration. During some robotics applications using AR, we observed instances of misalignment in the visual representation within the designated workspace. This misalignment can potentially impact the accuracy of the robot’s operations during the task. Based on the previous research on AR-assisted robot programming systems, this work investigates the sources of misalignment errors and presents a simple and efficient calibration procedure to reduce the misalignment accuracy in general video see-through AR systems. To accurately superimpose virtual information onto the real environment, it is necessary to identify the sources and propagation of errors. In this work, we outline the linear transformation and projection of each point from the virtual world space to the virtual screen coordinates. An offline calibration method is introduced to determine the offset matrix from the head-mounted display (HMD) to the camera, and experiments are conducted to validate the improvement achieved through the calibration process.more » « less
-
Mobile augmented reality (AR) has been attracting considerable attention from industry and academia due to its potential to provide vibrant immersive experiences that seamlessly blend physical and virtual worlds. In this paper we focus on creating contextual and personalized AR experiences via edge-based on-demand provisioning of holographic content most appropriate for the conditions and/or most matching user interests. We present edge-based hologram provisioning and pre-provisioning frameworks we developed for Google ARCore and Magic Leap One AR experiences, and describe open challenges and research directions associated with this approach to holographic content storage and transfer. The code we have developed for this paper is available online.more » « less
-
High-quality environment lighting is essential for creating immersive mobile augmented reality (AR) experiences. However, achieving visually coherent estimation for mobile AR is challenging due to several key limitations in AR device sensing capabilities, including low camera FoV and limited pixel dynamic ranges. Recent advancements in generative AI, which can generate high-quality images from different types of prompts, including texts and images, present a potential solution for high-quality lighting estimation. Still, to effectively use generative image diffusion models, we must address two key limitations of content quality and slow inference. In this work, we design and implement a generative lighting estimation system called CleAR that can produce high-quality, diverse environment maps in the format of 360° HDR images. Specifically, we design a two-step generation pipeline guided by AR environment context data to ensure the output aligns with the physical environment's visual context and color appearance. To improve the estimation robustness under different lighting conditions, we design a real-time refinement component to adjust lighting estimation results on AR devices. To train and test our generative models, we curate a large-scale environment lighting estimation dataset with diverse lighting conditions. Through a combination of quantitative and qualitative evaluations, we show that CleAR outperforms state-of-the-art lighting estimation methods on both estimation accuracy, latency, and robustness, and is rated by 31 participants as producing better renderings for most virtual objects. For example, CleAR achieves 51% to 56% accuracy improvement on virtual object renderings across objects of three distinctive types of materials and reflective properties. CleAR produces lighting estimates of comparable or better quality in just 3.2 seconds---over 110X faster than state-of-the-art methods. Moreover, CleAR supports real-time refinement of lighting estimation results, ensuring robust and timely updates for AR applications.more » « less
-
Virtual content instability caused by device pose tracking error remains a prevalent issue in markerless augmented reality (AR), especially on smartphones and tablets. However, when examining environments which will host AR experiences, it is challenging to determine where those instability artifacts will occur; we rarely have access to ground truth pose to measure pose error, and even if pose error is available, traditional visualizations do not connect that data with the real environment, limiting their usefulness. To address these issues we present SiTAR (Situated Trajectory Analysis for Augmented Reality), the first situated trajectory analysis system for AR that incorporates estimates of pose tracking error. We start by developing the first uncertainty-based pose error estimation method for visual-inertial simultaneous localization and mapping (VI-SLAM), which allows us to obtain pose error estimates without ground truth; we achieve an average accuracy of up to 96.1% and an average FI score of up to 0.77 in our evaluations on four VI-SLAM datasets. Next, we present our SiTAR system, implemented for ARCore devices, combining a backend that supplies uncertainty-based pose error estimates with a frontend that generates situated trajectory visualizations. Finally, we evaluate the efficacy of SiTAR in realistic conditions by testing three visualization techniques in an in-the-wild study with 15 users and 13 diverse environments; this study reveals the impact both environment scale and the properties of surfaces present can have on user experience and task performance.more » « less
An official website of the United States government

