skip to main content


Title: Integrated Design of Augmented Reality Spaces Using Virtual Environments
Demand is growing for markerless augmented reality (AR) experiences, but designers of the real-world spaces that host them still have to rely on inexact, qualitative guidelines on the visual environment to try and facilitate accurate pose tracking. Furthermore, the need for visual texture to support markerless AR is often at odds with human aesthetic preferences, and understanding how to balance these competing requirements is challenging due to the siloed nature of the relevant research areas. To address this, we present an integrated design methodology for AR spaces, that incorporates both tracking and human factors into the design process. On the tracking side, we develop the first VI-SLAM evaluation technique that combines the flexibility and control of virtual environments with real inertial data. We use it to perform systematic, quantitative experiments on the effect of visual texture on pose estimation accuracy; through 2000 trials in 20 environments, we reveal the impact of both texture complexity and edge strength. On the human side, we show how virtual reality (VR) can be used to evaluate user satisfaction with environments, and highlight how this can be tailored to AR research and use cases. Finally, we demonstrate our integrated design methodology with a case study on AR museum design, in which we conduct both VI-SLAM evaluations and a VR-based user study of four different museum environments.  more » « less
Award ID(s):
2046072 1908051 1903136
NSF-PAR ID:
10407033
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR)
Page Range / eLocation ID:
297 to 306
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Virtual content instability caused by device pose tracking error remains a prevalent issue in markerless augmented reality (AR), especially on smartphones and tablets. However, when examining environments which will host AR experiences, it is challenging to determine where those instability artifacts will occur; we rarely have access to ground truth pose to measure pose error, and even if pose error is available, traditional visualizations do not connect that data with the real environment, limiting their usefulness. To address these issues we present SiTAR (Situated Trajectory Analysis for Augmented Reality), the first situated trajectory analysis system for AR that incorporates estimates of pose tracking error. We start by developing the first uncertainty-based pose error estimation method for visual-inertial simultaneous localization and mapping (VI-SLAM), which allows us to obtain pose error estimates without ground truth; we achieve an average accuracy of up to 96.1% and an average FI score of up to 0.77 in our evaluations on four VI-SLAM datasets. Next, we present our SiTAR system, implemented for ARCore devices, combining a backend that supplies uncertainty-based pose error estimates with a frontend that generates situated trajectory visualizations. Finally, we evaluate the efficacy of SiTAR in realistic conditions by testing three visualization techniques in an in-the-wild study with 15 users and 13 diverse environments; this study reveals the impact both environment scale and the properties of surfaces present can have on user experience and task performance. 
    more » « less
  2. Mobile augmented reality (AR) has a wide range of promising applications, but its efficacy is subject to the impact of environment texture on both machine and human perception. Performance of the machine perception algorithm underlying accurate positioning of virtual content, visual-inertial SLAM (VI-SLAM), is known to degrade in low-texture conditions, but there is a lack of data in realistic scenarios. We address this through extensive experiments using a game engine-based emulator, with 112 textures and over 5000 trials. Conversely, human task performance and response times in AR have been shown to increase in environments perceived as textured. We investigate and provide encouraging evidence for invisible textures, which result in good VI-SLAM performance with minimal impact on human perception of virtual content. This arises from fundamental differences between VI-SLAM-based machine perception, and human perception as described by the contrast sensitivity function. Our insights open up exciting possibilities for deploying ambient IoT devices that display invisible textures, as part of systems which automatically optimize AR environments. 
    more » « less
  3. As augmented and virtual reality (AR/VR) technology matures, a method is desired to represent real-world persons visually and aurally in a virtual scene with high fidelity to craft an immersive and realistic user experience. Current technologies leverage camera and depth sensors to render visual representations of subjects through avatars, and microphone arrays are employed to localize and separate high-quality subject audio through beamforming. However, challenges remain in both realms. In the visual domain, avatars can only map key features (e.g., pose, expression) to a predetermined model, rendering them incapable of capturing the subjects’ full details. Alternatively, high-resolution point clouds can be utilized to represent human subjects. However, such three-dimensional data is computationally expensive to process. In the realm of audio, sound source separation requires prior knowledge of the subjects’ locations. However, it may take unacceptably long for sound source localization algorithms to provide this knowledge, which can still be error-prone, especially with moving objects. These challenges make it difficult for AR systems to produce real-time, high-fidelity representations of human subjects for applications such as AR/VR conferencing that mandate negligible system latency. We present Acuity, a real-time system capable of creating high-fidelity representations of human subjects in a virtual scene both visually and aurally. Acuity isolates subjects from high-resolution input point clouds. It reduces the processing overhead by performing background subtraction at a coarse resolution, then applying the detected bounding boxes to fine-grained point clouds. Meanwhile, Acuity leverages an audiovisual sensor fusion approach to expedite sound source separation. The estimated object location in the visual domain guides the acoustic pipeline to isolate the subjects’ voices without running sound source localization. Our results demonstrate that Acuity can isolate multiple subjects’ high-quality point clouds with a maximum latency of 70 ms and average throughput of over 25 fps, while separating audio in less than 30 ms. We provide the source code of Acuity at: https://github.com/nesl/Acuity. 
    more » « less
  4. Mobile augmented reality (AR) has the potential to enable immersive, natural interactions between humans and cyber-physical systems. In particular markerless AR, by not relying on fiducial markers or predefined images, provides great convenience and flexibility for users. However, unwanted virtual object movement frequently occurs in markerless smartphone AR due to inaccurate scene understanding, and resulting errors in device pose tracking. We examine the factors which may affect virtual object stability, design experiments to measure it, and conduct systematic quantitative characterizations across six different user actions and five different smartphone configurations. Our study demonstrates noticeable instances of spatial instability in virtual objects in all but the simplest settings (with position errors of greater than 10cm even on the best-performing smartphones), and underscores the need for further enhancements to pose tracking algorithms for smartphone-based markerless AR. 
    more » « less
  5. Mobile augmented reality (AR) has the potential to enable immersive, natural interactions between humans and cyber-physical systems. In particular markerless AR, by not relying on fiducial markers or predefined images, provides great convenience and flexibility for users. However, unwanted virtual object movement frequently occurs in markerless smartphone AR due to inaccurate scene understanding, and resulting errors in device pose tracking. We examine the factors which may affect virtual object stability, design experiments to measure it, and conduct systematic quantitative characterizations across six different user actions and five different smartphone configurations. Our study demonstrates noticeable instances of spatial instability in virtual objects in all but the simplest settings (with position errors of greater than 10cm even on the best-performing smartphones), and underscores the need for further enhancements to pose tracking algorithms for smartphone-based markerless AR. 
    more » « less