Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for mobile AR is still elusive. In this paper, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the `spatial-temporal' correlation among mobile AR users to improve recognition accuracy. We implement CollabAR on four different commodity devices, and evaluate its performance on two multi-view image datasets. Our evaluation demonstrates that CollabAR achieves over 96% recognition accuracy for images with severe distortions, while reducing the end-to-end system latency to as low as 17.8ms for commodity mobile devices.
more »
« less
Edge-assisted Collaborative Image Recognition for Mobile Augmented Reality
Mobile Augmented Reality (AR), which overlays digital content on the real-world scenes surrounding a user, is bringing immersive interactive experiences where the real and virtual worlds are tightly coupled. To enable seamless and precise AR experiences, an image recognition system that can accurately recognize the object in the camera view with low system latency is required. However, due to the pervasiveness and severity of image distortions, an effective and robust image recognition solution for “in the wild” mobile AR is still elusive. In this article, we present CollabAR, an edge-assisted system that provides distortion-tolerant image recognition for mobile AR with imperceptible system latency. CollabAR incorporates both distortion-tolerant and collaborative image recognition modules in its design. The former enables distortion-adaptive image recognition to improve the robustness against image distortions, while the latter exploits the spatial-temporal correlation among mobile AR users to improve recognition accuracy. Moreover, as it is difficult to collect a large-scale image distortion dataset, we propose a Cycle-Consistent Generative Adversarial Network-based data augmentation method to synthesize realistic image distortion. Our evaluation demonstrates that CollabAR achieves over 85% recognition accuracy for “in the wild” images with severe distortions, while reducing the end-to-end system latency to as low as 18.2 ms.
more »
« less
- NSF-PAR ID:
- 10315355
- Date Published:
- Journal Name:
- ACM transactions on sensor networks
- Volume:
- 18
- Issue:
- 1
- ISSN:
- 1550-4859
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Mobile Augmented Reality (AR), which overlays digital information with real-world scenes surrounding a user, provides an enhanced mode of interaction with the ambient world. Contextual AR applications rely on image recognition to identify objects in the view of the mobile device. In practice, due to image distortions and device resource constraints, achieving high performance image recognition for AR is challenging. Recent advances in edge computing offer opportunities for designing collaborative image recognition frameworks for AR. In this demonstration, we present CollabAR, an edge-assisted collaborative image recognition framework. CollabAR allows AR devices that are facing the same scene to collaborate on the recognition task. Demo participants develop an intuition for different image distortions and their impact on image recognition accuracy. We showcase how heterogeneous images taken by different users can be aggregated to improve recognition accuracy and provide a better user experience in AR.more » « less
-
Augmented reality is an emerging application on mobile devices. However, there is a lack of understanding of the communication requirements and challenges of multi-user AR scenarios. In this position paper, we propose several important research issues that need to be addressed for low-latency, accurate shared AR experiences: (a) Systems tradeoffs of AR communication architectures used today in mobile AR platforms; (b) Understanding AR communication patterns and adapting the AR application layer to dynamically changing network conditions; and (c) Tools and methodologies to evaluate AR quality of experience in real time on mobile devices. We present preliminary measurements of off-the-shelf mobile AR platforms as well as results from our AR system, ShareAR, illustrating performance tradeoffs and indicating promising new research directions.more » « less
-
null (Ed.)Mobile Augmented Reality (AR) provides immersive experiences by aligning virtual content (holograms) with a view of the real world. When a user places a hologram it is usually expected that like a real object, it remains in the same place. However, positional errors frequently occur due to inaccurate environment mapping and device localization, to a large extent determined by the properties of natural visual features in the scene. In this demonstration we present SceneIt, the first visual environment rating system for mobile AR based on predictions of hologram positional error magnitude. SceneIt allows users to determine if virtual content placed in their environment will drift noticeably out of position, without requiring them to place that content. It shows that the severity of positional error for a given visual environment is predictable, and that this prediction can be calculated with sufficiently high accuracy and low latency to be useful in mobile AR applications.more » « less
-
Mobile headsets should be capable of understanding 3D physical environments to offer a truly immersive experience for augmented/mixed reality (AR/MR). However, their small form-factor and limited computation resources make it extremely challenging to execute in real-time 3D vision algorithms, which are known to be more compute-intensive than their 2D counterparts. In this paper, we propose DeepMix, a mobility-aware, lightweight, and hybrid 3D object detection framework for improving the user experience of AR/MR on mobile headsets. Motivated by our analysis and evaluation of state-of-the-art 3D object detection models, DeepMix intelligently combines edge-assisted 2D object detection and novel, on-device 3D bounding box estimations that leverage depth data captured by headsets. This leads to low end-to-end latency and significantly boosts detection accuracy in mobile scenarios. A unique feature of DeepMix is that it fully exploits the mobility of headsets to fine-tune detection results and boost detection accuracy. To the best of our knowledge, DeepMix is the first 3D object detection that achieves 30 FPS (i.e., an end-to-end latency much lower than the 100 ms stringent requirement of interactive AR/MR). We implement a prototype of DeepMix on Microsoft HoloLens and evaluate its performance via both extensive controlled experiments and a user study with 30+ participants. DeepMix not only improves detection accuracy by 9.1--37.3% but also reduces end-to-end latency by 2.68--9.15×, compared to the baseline that uses existing 3D object detection models.more » « less