Vision Language Model-Based Solution for Obstruction Attack in AR: A Meta Quest 3 Implementation

Xiu, Yanming; Gorlatova, Maria

doi:10.1109/VRW66409.2025.00464

This content will become publicly available on March 8, 2026

Obstruction attacks in Augmented Reality (AR) pose significant challenges by obscuring critical real-world objects. This work demonstrates the first implementation of obstruction detection on a video see-through head-mounted display (HMD), the Meta Quest 3. Leveraging a vision language model (VLM) and a multi-modal object detection model, our system detects obstructions by analyzing both raw and augmented images. Because access to raw camera feeds is limited, the system employs an image-capturing approach based on Oculus casting: it captures a sequence of images and identifies the raw image among them. Our implementation showcases the feasibility of effective obstruction detection in AR environments and highlights future opportunities for improving real-time detection through enhanced camera access.

Award ID(s): 2231975; 2312760; 2046072

PAR ID: 10647728

Author(s) / Creator(s): Xiu, Yanming; Gorlatova, Maria

Publisher / Repository: IEEE VR 2025

Date Published: 2025-03-08

Page Range / eLocation ID: 1638 to 1639

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1109/VRW66409.2025.00464
