Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for “in the wild” mobile AR is still elusive. In this article, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the spatial-temporal correlation among mobile AR users to improve recognition accuracy. Moreover, as it is difficult to collect a large-scale image distortion dataset, we propose a Cycle-Consistent Generative Adversarial Network-based data augmentation method to synthesize realistic image distortion. Our evaluation demonstrates that CollabAR achieves over 85% recognition accuracy for “in the wild” images with severe distortions, while reducing the end-to-end system latency to as low as 18.2 ms.
DeepMix: mobility-aware, lightweight, and hybrid 3D object detection for headsets
Mobile headsets should be capable of understanding 3D physical environments to offer a truly immersive experience for augmented/mixed reality (AR/MR). However, their small form factor and limited computation resources make it extremely challenging to execute 3D vision algorithms in real time, as these algorithms are known to be more compute-intensive than their 2D counterparts. In this paper, we propose DeepMix, a mobility-aware, lightweight, and hybrid 3D object detection framework for improving the user experience of AR/MR on mobile headsets. Motivated by our analysis and evaluation of state-of-the-art 3D object detection models, DeepMix intelligently combines edge-assisted 2D object detection and novel, on-device 3D bounding box estimations that leverage depth data captured by headsets. This leads to low end-to-end latency and significantly boosts detection accuracy in mobile scenarios. A unique feature of DeepMix is that it fully exploits the mobility of headsets to fine-tune detection results and boost detection accuracy. To the best of our knowledge, DeepMix is the first 3D object detection framework that achieves 30 FPS (i.e., an end-to-end latency much lower than the stringent 100 ms requirement of interactive AR/MR). We implement a prototype of DeepMix on Microsoft HoloLens and evaluate its performance via both extensive controlled experiments and a user study with 30+ participants. DeepMix not only improves detection accuracy by 9.1–37.3% but also reduces end-to-end latency by 2.68–9.15×, compared to the baseline that uses existing 3D object detection models.
- PAR ID: 10358832
- Date Published:
- Journal Name: the 20th Annual International Conference on Mobile Systems, Applications and Services
- Page Range / eLocation ID: 28 to 41
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for mobile AR is still elusive. In this paper, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the spatial-temporal correlation among mobile AR users to improve recognition accuracy. We implement CollabAR on four different commodity devices, and evaluate its performance on two multi-view image datasets. Our evaluation demonstrates that CollabAR achieves over 96% recognition accuracy for images with severe distortions, while reducing the end-to-end system latency to as low as 17.8 ms for commodity mobile devices.
-
Locating RFID-tagged items in the environment and guiding humans to retrieve the tagged items is an important problem in the RFID community. This paper explores how to exploit synergies between Augmented Reality (AR) headsets and RFID localization to help solve this problem by improving both user experience and localization accuracy. Using fundamental mathematical formulations for RFID localization, we derive confidence metrics and display guidance to the user to improve their experience and enable them to retrieve items faster. We build our primitives into an end-to-end system, RF-AR, and show that it achieves 8.6 cm median localization accuracy within 76 seconds and enables 55% faster retrieval than state-of-the-art past systems. Our results demonstrate that AR-based “human-in-the-loop” designs can make the localization task more accurate and efficient, and thus hold the potential to improve processes where items need to be retrieved quickly, such as in manufacturing, retail, and warehousing.
-
Augmented Reality (AR) has been widely hailed as a representative of ultra-high bandwidth and ultra-low latency apps that will be enabled by 5G networks. While single-user AR can perform AR tasks locally on the mobile device, multi-user AR apps, which allow multiple users to interact within the same physical space, critically rely on the cellular network to support user interactions. However, a recent study showed that multi-user AR apps can experience very high end-to-end latency when running over LTE, rendering user interaction practically infeasible. In this paper, we study whether 5G mmWave, which promises significant bandwidth and latency improvements over LTE, can support multi-user AR by conducting an in-depth measurement study of the same popular multi-user AR app over both LTE and 5G mmWave. Our measurement and analysis show that: (1) The E2E AR latency over LTE is significantly lower than the values reported in the previous study. However, it still remains too high for practical user interaction. (2) 5G mmWave brings no benefits to multi-user AR apps. (3) While 5G mmWave reduces the latency of the uplink visual data transmission, there are other components of the AR app that are independent of the network technology and account for a significant fraction of the E2E latency. (4) The app drains 66% more network energy, which translates to 28% higher total energy, over 5G mmWave than over LTE.
-
Mobile Augmented Reality (AR) provides immersive experiences by aligning virtual content (holograms) with a view of the real world. When a user places a hologram, it is usually expected that, like a real object, it remains in the same place. However, positional errors frequently occur due to inaccurate environment mapping and device localization, which are to a large extent determined by the properties of natural visual features in the scene. In this demonstration we present SceneIt, the first visual environment rating system for mobile AR based on predictions of hologram positional error magnitude. SceneIt allows users to determine if virtual content placed in their environment will drift noticeably out of position, without requiring them to place that content. It shows that the severity of positional error for a given visual environment is predictable, and that this prediction can be calculated with sufficiently high accuracy and low latency to be useful in mobile AR applications.