LOBSTAR: Language Model-based Obstruction Detection for Augmented Reality

Xiu, Y; Scargill, T; Gorlatova, M

In Augmented Reality (AR), improper virtual content placement can obstruct real-world elements, causing confusion and degrading the experience. To address this, we present LOBSTAR (Language model-based OBSTruction detection for Augmented Reality), the first system leveraging a vision language model (VLM) to detect key objects and prevent obstructions in AR. We evaluated LOBSTAR using both real-world and virtual-scene images and developed a mobile app for AR content obstruction detection. Our results demonstrate that LOBSTAR effectively understands scenes and detects obstructive content with well-designed VLM prompts, achieving up to 96% accuracy and a detection latency of 580 ms on a mobile app.

Award ID(s): 2231975, 2046072, 2312760

PAR ID: 10553271

Author(s) / Creator(s): Xiu, Y; Scargill, T; Gorlatova, M

Publisher / Repository: IEEE ISMAR 2024

Date Published: 2024-10-14

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
The DOI is not currently available.
