skip to main content


Title: A Systematic Review of the Convergence of Augmented Reality, Intelligent Virtual Agents, and the Internet of Things
In a seminal article on augmented reality (AR) [7], Ron Azuma defines AR as a variation of virtual reality (VR), which completely immerses a user inside a synthetic environment. Azuma says “In contrast, AR allows the user to see the real world, with virtual objects superimposed upon or composited with the real world” [7] (emphasis added). Typically, a user wears a tracked stereoscopic head-mounted display (HMD) or holds a smartphone, showing the real world through optical or video means, with superimposed graphics that provide the appearance of virtual content that is related to and registered with the real world. While AR has been around since the 1960s [72], it is experiencing a renaissance of development and consumer interest. With exciting products from Microsoft (HoloLens), Metavision (Meta 2), and others; Apple’s AR Developer’s Kit (ARKit); and well-funded startups like Magic Leap [54], the future is looking even brighter, expecting that AR technologies will be absorbed into our daily lives and have a strong influence on our society in the foreseeable future.  more » « less
Award ID(s):
1800961
NSF-PAR ID:
10105846
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Transactions on computational science and computational intelligence
ISSN:
2569-7080
Page Range / eLocation ID:
37
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This poster presents the use of Augmented Reality (AR) and Virtual Reality (VR) to tackle 4 amongst the “14 Grand Challenges for Engineering in the 21st Century” identified by National Academy of Engineering. AR and VR are the technologies of the present and the future. AR creates a composite view by adding digital content to a real world view, often by using the camera of a smartphone and VR creates an immersive view where the user’s view is often cut off from the real world. The 14 challenges identify areas of science and technology that are achievable and sustainable to assist people and the planet to prosper. The 4 challenges tackled using AR/VR application in this poster are: Enhance virtual reality, Advance personalized learning, Provide access to clean water, and Make solar energy affordable. The solar system VR application is aimed at tackling two of the engineering challenges: (1) Enhance virtual reality and (2) Advance personalized learning. The VR application assists the user in visualizing and understanding our solar system by using a VR headset. It includes an immersive 360 degree view of our solar system where the user can use controllers to interact with celestial bodies-related information and to teleport to different points in the space to have a closer look at the planets and the Sun. The user has six degrees of freedom. The AR application for water tackles the engineering challenge: “Provide access to clean water”. The AR water application shows information on drinking water accessibility and the eco-friendly usage of bottles over plastic cups within the department buildings inside Auburn University. The user of the application has an augmented view of drinking water information on a smartphone. Every time the user points the smartphone camera towards a building, the application will render a composite view with drinking water information associated to the building. The Sun path visualization AR application tackles the engineering challenge: “Make solar energy affordable”. The application helps the user visualize sun path at a selected time and location. The sun path is augmented in the camera view of the device when the user points the camera towards the sky. The application provides information on sun altitude and azimuth. Also, it provides the user with sunrise and sunset data for a selected day. The information provided by the application can aid the user with effective solar panel placement. Using AR and VR technology to tackle these challenges enhances the user experience. The information from these applications are better curated and easily visualized, thus readily understandable by the end user. Therefore, usage of AR and VR technology to tackle these type of engineering challenges looks promising. 
    more » « less
  2. While tremendous advances in visual and auditory realism have been made for virtual and augmented reality (VR/AR), introducing a plausible sense of physicality into the virtual world remains challenging. Closing the gap between real-world physicality and immersive virtual experience requires a closed interaction loop: applying user-exerted physical forces to the virtual environment and generating haptic sensations back to the users. However, existing VR/AR solutions either completely ignore the force inputs from the users or rely on obtrusive sensing devices that compromise user experience. By identifying users' muscle activation patterns while engaging in VR/AR, we design a learning-based neural interface for natural and intuitive force inputs. Specifically, we show that lightweight electromyography sensors, resting non-invasively on users' forearm skin, inform and establish a robust understanding of their complex hand activities. Fuelled by a neural-network-based model, our interface can decode finger-wise forces in real-time with 3.3% mean error, and generalize to new users with little calibration. Through an interactive psychophysical study, we show that human perception of virtual objects' physical properties, such as stiffness, can be significantly enhanced by our interface. We further demonstrate that our interface enables ubiquitous control via finger tapping. Ultimately, we envision our findings to push forward research towards more realistic physicality in future VR/AR. 
    more » « less
  3. In-person human interaction relies on our spatial perception of each other and our surroundings. Current remote communication tools partially address each of these aspects. Video calls convey real user representations but without spatial interactions. Augmented and Virtual Reality (AR/VR) experiences are immersive and spatial but often use virtual environments and characters instead of real-life representations. Bridging these gaps, we introduce DualStream, a system for synchronous mobile AR remote communication that captures, streams, and displays spatial representations of users and their surroundings. DualStream supports transitions between user and environment representations with different levels of visuospatial fidelity, as well as the creation of persistent shared spaces using environment snapshots. We demonstrate how DualStream can enable spatial communication in real-world contexts, and support the creation of blended spaces for collaboration. A formative evaluation of DualStream revealed that users valued the ability to interact spatially and move between representations, and could see DualStream fitting into their own remote communication practices in the near future. Drawing from these findings, we discuss new opportunities for designing more widely accessible spatial communication tools, centered around the mobile phone. 
    more » « less
  4. Demand is growing for markerless augmented reality (AR) experiences, but designers of the real-world spaces that host them still have to rely on inexact, qualitative guidelines on the visual environment to try and facilitate accurate pose tracking. Furthermore, the need for visual texture to support markerless AR is often at odds with human aesthetic preferences, and understanding how to balance these competing requirements is challenging due to the siloed nature of the relevant research areas. To address this, we present an integrated design methodology for AR spaces, that incorporates both tracking and human factors into the design process. On the tracking side, we develop the first VI-SLAM evaluation technique that combines the flexibility and control of virtual environments with real inertial data. We use it to perform systematic, quantitative experiments on the effect of visual texture on pose estimation accuracy; through 2000 trials in 20 environments, we reveal the impact of both texture complexity and edge strength. On the human side, we show how virtual reality (VR) can be used to evaluate user satisfaction with environments, and highlight how this can be tailored to AR research and use cases. Finally, we demonstrate our integrated design methodology with a case study on AR museum design, in which we conduct both VI-SLAM evaluations and a VR-based user study of four different museum environments. 
    more » « less
  5. As augmented and virtual reality (AR/VR) technology matures, a method is desired to represent real-world persons visually and aurally in a virtual scene with high fidelity to craft an immersive and realistic user experience. Current technologies leverage camera and depth sensors to render visual representations of subjects through avatars, and microphone arrays are employed to localize and separate high-quality subject audio through beamforming. However, challenges remain in both realms. In the visual domain, avatars can only map key features (e.g., pose, expression) to a predetermined model, rendering them incapable of capturing the subjects’ full details. Alternatively, high-resolution point clouds can be utilized to represent human subjects. However, such three-dimensional data is computationally expensive to process. In the realm of audio, sound source separation requires prior knowledge of the subjects’ locations. However, it may take unacceptably long for sound source localization algorithms to provide this knowledge, which can still be error-prone, especially with moving objects. These challenges make it difficult for AR systems to produce real-time, high-fidelity representations of human subjects for applications such as AR/VR conferencing that mandate negligible system latency. We present Acuity, a real-time system capable of creating high-fidelity representations of human subjects in a virtual scene both visually and aurally. Acuity isolates subjects from high-resolution input point clouds. It reduces the processing overhead by performing background subtraction at a coarse resolution, then applying the detected bounding boxes to fine-grained point clouds. Meanwhile, Acuity leverages an audiovisual sensor fusion approach to expedite sound source separation. The estimated object location in the visual domain guides the acoustic pipeline to isolate the subjects’ voices without running sound source localization. Our results demonstrate that Acuity can isolate multiple subjects’ high-quality point clouds with a maximum latency of 70 ms and average throughput of over 25 fps, while separating audio in less than 30 ms. We provide the source code of Acuity at: https://github.com/nesl/Acuity. 
    more » « less