Mobile Augmented Reality (AR), which overlays digital information on the real-world scene surrounding a user, provides an enhanced mode of interaction with the ambient world. Contextual AR applications rely on image recognition to identify objects in the view of the mobile device. In practice, image distortions and device resource constraints make high-performance image recognition for AR challenging. Recent advances in edge computing offer opportunities for designing collaborative image recognition frameworks for AR. In this demonstration, we present CollabAR, an edge-assisted collaborative image recognition framework that allows AR devices facing the same scene to collaborate on the recognition task. Demo participants develop an intuition for different image distortions and their impact on recognition accuracy. We showcase how heterogeneous images taken by different users can be aggregated to improve recognition accuracy and provide a better user experience in AR.
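To make the aggregation idea concrete, here is a minimal sketch assuming a simple average-fusion rule (the abstract does not specify CollabAR's actual fusion mechanism, and all names below are illustrative): softmax outputs from several devices viewing the same scene are combined so that confident, undistorted views outweigh blurry ones.

```python
import numpy as np

def aggregate_predictions(per_device_logits):
    """Fuse per-device classifier outputs for one shared scene.

    per_device_logits: list of 1-D numpy arrays, one logit vector per
    collaborating AR device. A distorted view (blur, low light) tends
    to yield a flatter, less confident distribution, so averaging the
    softmax probabilities lets cleaner views dominate the decision.
    """
    probs = []
    for logits in per_device_logits:
        e = np.exp(logits - logits.max())   # numerically stable softmax
        probs.append(e / e.sum())
    fused = np.mean(probs, axis=0)          # simple average fusion
    return int(np.argmax(fused)), fused

# Example: three devices, three classes; device 3 has a blurry view.
logits = [np.array([2.5, 0.1, 0.3]),
          np.array([2.2, 0.4, 0.2]),
          np.array([0.6, 0.5, 0.7])]
label, fused = aggregate_predictions(logits)
print(label, fused.round(3))   # class 0 wins despite the noisy view
```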
Demonstrating Visual Information Manipulation Attacks in Augmented Reality: A Hands-On Miniature City-Based Setup
Augmented reality (AR) enhances user interaction with the real world but also presents vulnerabilities, particularly through Visual Information Manipulation (VIM) attacks. These attacks alter important real-world visual cues, leading to user confusion and misdirected actions. In this demo, we present a hands-on experience using a miniature city setup, where users interact with manipulated AR content via the Meta Quest 3. The demo highlights the impact of VIM attacks on user decision-making and underscores the need for effective security measures in AR systems. Future work includes a user study and cross-platform testing.
- PAR ID: 10647729
- Publisher / Repository: ACM
- Date Published:
- Page Range / eLocation ID: 501–502
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Funt, Brian; Kingsburgh, Robin (Ed.) Optical see-through AR presents virtual objects to a user through a transparent display that blends them with the real-world environment. This is simultaneously novel and familiar: beam splitters have long been used for ghostly visual effects, yet the mechanism is exactly the same as the reflections in an everyday window. The history of theatrical visual effects leads through a series of vision-science experiments to today's research on the perception of transparent AR systems. Still, there is a tension in the perception of AR stimuli: users of AR seem to be able to separate, or scission, the virtual and real layers, depending on their understanding of the scene and its visual characteristics.
- In Augmented Reality (AR), virtual content enhances the user experience by providing additional information. However, improperly positioned or designed virtual content can be detrimental to task performance, as it can impair users' ability to accurately interpret real-world information. In this paper, we examine two types of task-detrimental virtual content: obstruction attacks, in which virtual content prevents users from seeing real-world objects, and information manipulation attacks, in which virtual content interferes with users' ability to accurately interpret real-world information. We provide a mathematical framework to characterize these attacks and create a custom open-source dataset for attack evaluation. To address these attacks, we introduce ViDDAR (Vision language model-based Task-Detrimental content Detector for Augmented Reality), a full-reference system that leverages Vision Language Models (VLMs) and deep learning techniques to monitor and evaluate virtual content in AR environments, employing a user-edge-cloud architecture to balance performance with low latency. To the best of our knowledge, ViDDAR is the first system to employ VLMs for detecting task-detrimental content in AR settings. Our evaluation demonstrates that ViDDAR effectively understands complex scenes and detects task-detrimental content, achieving up to 92.15% obstruction detection accuracy with a detection latency of 533 ms, and 82.46% information manipulation detection accuracy with a latency of 9.62 s. (A sketch of the full-reference idea appears after this list.)
- Point-Based Neural Rendering (PBNR) is emerging as a promising class of rendering techniques, driven by growing demand for real-time, photorealistic rendering in AR/VR and digital twins. Achieving real-time PBNR on mobile devices is challenging. This paper proposes MetaSapiens, a PBNR system that for the first time delivers real-time neural rendering on mobile devices while maintaining human visual quality. MetaSapiens combines three techniques. First, we present an efficiency-aware pruning technique to optimize rendering speed. Second, we introduce a Foveated Rendering (FR) method for PBNR, leveraging humans' low visual acuity in peripheral regions to relax rendering quality and improve rendering speed (see the foveation sketch after this list). Finally, we propose an accelerator design for FR, addressing the load-imbalance issue in FR-based PBNR. Our evaluation shows that our system achieves an order-of-magnitude speedup over existing PBNR models without sacrificing subjective visual quality, as confirmed by a user study. The code and demo are available at: https://horizonlab.org/metasapiens/.
- Demand is growing for markerless augmented reality (AR) experiences, but designers of the real-world spaces that host them still have to rely on inexact, qualitative guidelines about the visual environment to try to facilitate accurate pose tracking. Furthermore, the need for visual texture to support markerless AR is often at odds with human aesthetic preferences, and understanding how to balance these competing requirements is challenging due to the siloed nature of the relevant research areas. To address this, we present an integrated design methodology for AR spaces that incorporates both tracking and human factors into the design process. On the tracking side, we develop the first VI-SLAM evaluation technique that combines the flexibility and control of virtual environments with real inertial data. We use it to perform systematic, quantitative experiments on the effect of visual texture on pose-estimation accuracy; through 2,000 trials in 20 environments, we reveal the impact of both texture complexity and edge strength (a crude edge-strength measure is sketched below). On the human side, we show how virtual reality (VR) can be used to evaluate user satisfaction with environments, and highlight how this can be tailored to AR research and use cases. Finally, we demonstrate our integrated design methodology with a case study on AR museum design, in which we conduct both VI-SLAM evaluations and a VR-based user study of four different museum environments.
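For the ViDDAR entry above: the system itself relies on Vision Language Models, but the full-reference intuition, comparing what the user would see without AR against what a virtual overlay hides, can be illustrated with a purely geometric toy check. The box format, threshold, and names below are assumptions for illustration, not ViDDAR's pipeline.

```python
def coverage(real_box, virt_box):
    """Fraction of a real object's bounding box hidden by a virtual overlay.

    Boxes are (x1, y1, x2, y2) in pixels. This captures only the
    full-reference intuition behind obstruction detection, not
    ViDDAR's VLM-based method.
    """
    ix1, iy1 = max(real_box[0], virt_box[0]), max(real_box[1], virt_box[1])
    ix2, iy2 = min(real_box[2], virt_box[2]), min(real_box[3], virt_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (real_box[2] - real_box[0]) * (real_box[3] - real_box[1])
    return inter / area if area else 0.0

stop_sign = (100, 80, 180, 160)            # detected real-world object
ad_banner = (90, 70, 200, 170)             # placed virtual content
if coverage(stop_sign, ad_banner) > 0.5:   # 0.5 threshold is an assumption
    print("possible obstruction attack")
```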
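For the MetaSapiens entry: foveated rendering spends less effort in the visual periphery, where human acuity is low. Below is a minimal sketch of picking a rendering-quality tier by distance from the gaze point; the radii and tier count are illustrative placeholders, not the paper's calibrated values.

```python
import math

def fr_tier(px, py, gaze, radii=(150.0, 400.0)):
    """Map a pixel to a rendering-quality tier by eccentricity.

    Tier 0 = full quality near the fovea; higher tiers relax quality
    farther out. The radii and tier count are placeholder assumptions.
    """
    r = math.hypot(px - gaze[0], py - gaze[1])
    for tier, radius in enumerate(radii):
        if r < radius:
            return tier
    return len(radii)

# Pixels near the gaze point render at full quality; distant ones do not.
print(fr_tier(960, 540, gaze=(950, 545)))   # 0: foveal region
print(fr_tier(50, 50, gaze=(950, 545)))     # 2: far periphery
```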
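Finally, for the VI-SLAM entry on texture: the study quantifies how texture complexity and edge strength affect pose estimation. As a crude, assumed stand-in for an edge-strength measure (the paper's exact metric is not reproduced here), one can take the mean Sobel gradient magnitude over a grayscale view.

```python
import numpy as np

def mean_edge_strength(gray):
    """Mean Sobel gradient magnitude of a grayscale image (H x W array).

    Higher values indicate stronger edges, which generally give
    VI-SLAM more trackable features; a flat wall scores near zero.
    """
    h, w = gray.shape
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(gray.astype(float), 1, mode="edge")
    gx = sum(kx[i, j] * pad[i:i + h, j:j + w]
             for i in range(3) for j in range(3))
    gy = sum(ky[i, j] * pad[i:i + h, j:j + w]
             for i in range(3) for j in range(3))
    return float(np.mean(np.hypot(gx, gy)))

rng = np.random.default_rng(0)
flat = np.full((64, 64), 128.0)           # textureless surface
busy = rng.uniform(0, 255, (64, 64))      # high-texture pattern
print(mean_edge_strength(flat), mean_edge_strength(busy))  # ~0 vs. large
```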