Immersive virtual tours based on 360-degree cameras, showing famous outdoor scenery, are becoming increasingly desirable due to travel costs, pandemics, and other constraints. To feel immersive, a user must receive the view that accurately corresponds to her position and orientation in the virtual space as she moves through it, and this requires the cameras' orientations to be known. Outdoor tour settings have numerous, ultra-sparse cameras deployed across a wide area, making camera pose estimation challenging. As a result, pose estimation techniques such as SLAM, which require mobile or densely deployed cameras, are not applicable. In this paper we present a novel strategy called 360ViewPET, which automatically estimates the relative poses of two stationary, ultra-sparse (15 meters apart) 360-degree cameras using one equirectangular image taken by each camera. Our experiments show that it achieves accurate pose estimation, with a mean error as low as 0.9 degrees.
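The abstract does not spell out the 360ViewPET pipeline, but the underlying geometry can be sketched. Below is a minimal, hypothetical Python sketch (OpenCV + NumPy), not the 360ViewPET algorithm itself: match features between the two equirectangular images, lift matched pixels to unit bearing vectors on the sphere, and solve for the aligning rotation with a Kabsch SVD. It assumes the scenery is distant enough that the 15 m baseline can be neglected (a pure-rotation approximation), and the file names are placeholders.

```python
# Hypothetical sketch -- NOT the 360ViewPET pipeline. Estimates the
# relative rotation between two equirectangular 360-degree images,
# assuming distant scenery so the 15 m baseline is neglected.
import cv2
import numpy as np

def pixel_to_bearing(pts, w, h):
    """Map equirectangular pixel coordinates to unit vectors on the sphere."""
    lon = (pts[:, 0] / w) * 2.0 * np.pi - np.pi    # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (pts[:, 1] / h) * np.pi    # latitude in [-pi/2, pi/2]
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=1)

def relative_rotation(img_a, img_b):
    sift = cv2.SIFT_create()
    ka, da = sift.detectAndCompute(img_a, None)
    kb, db = sift.detectAndCompute(img_b, None)
    # Lowe's ratio test discards ambiguous matches.
    good = [m for m, n in cv2.BFMatcher(cv2.NORM_L2).knnMatch(da, db, k=2)
            if m.distance < 0.75 * n.distance]
    pa = np.float32([ka[m.queryIdx].pt for m in good])
    pb = np.float32([kb[m.trainIdx].pt for m in good])
    h, w = img_a.shape[:2]
    va, vb = pixel_to_bearing(pa, w, h), pixel_to_bearing(pb, w, h)
    # Kabsch: the rotation R minimizing sum ||R @ a_i - b_i||^2.
    u, _, vt = np.linalg.svd(vb.T @ va)
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

# Placeholder file names for the two cameras' snapshots.
img_a = cv2.imread("cam_a.jpg", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("cam_b.jpg", cv2.IMREAD_GRAYSCALE)
R = relative_rotation(img_a, img_b)
print("relative yaw: %.1f degrees" % np.degrees(np.arctan2(R[1, 0], R[0, 0])))
```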
Ultra-Sparse 360-Degree Camera View Synthesis for Immersive Virtual Tourism
360-degree video based virtual tours are becoming more and more popular due to travel costs and restrictions. Existing solutions rely on teleportation, 3D modeling, or image morphing, but none of them offers satisfactory immersion and scalability. In this paper, we propose a morphing-based, ultra-sparse 360-degree camera virtual tourism solution. It uses a novel bus-tour mode to improve immersion; in addition, it uses a series of strategies to improve feature matching so that morphing works well for ultra-sparse (15 m apart) cameras and the system can be deployed at large scale. Experimental results show that our work yields remarkably better feature matching and synthesized views.
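As an illustration of the morphing component, here is a minimal, hypothetical sketch, not the paper's method: warp each image toward linearly interpolated matched-point positions with a single global affine fit, then cross-dissolve. Real view morphing uses dense or triangulated correspondences; a lone affine is a crude stand-in that is only plausible for roughly distant scenery. `pts_a` and `pts_b` are assumed to be N x 2 float32 arrays of matched keypoint coordinates (e.g., from SIFT matching as sketched earlier).

```python
# Hypothetical morphing sketch -- not the paper's method. Synthesizes an
# intermediate view between cameras A and B by warping each image toward
# interpolated matched-point positions and cross-dissolving.
import cv2
import numpy as np

def morph(img_a, img_b, pts_a, pts_b, alpha):
    """Synthesize the view at position alpha in [0, 1] between A and B."""
    pts_mid = (1.0 - alpha) * pts_a + alpha * pts_b
    m_a, _ = cv2.estimateAffinePartial2D(pts_a, pts_mid)
    m_b, _ = cv2.estimateAffinePartial2D(pts_b, pts_mid)
    h, w = img_a.shape[:2]
    warp_a = cv2.warpAffine(img_a, m_a, (w, h))
    warp_b = cv2.warpAffine(img_b, m_b, (w, h))
    return cv2.addWeighted(warp_a, 1.0 - alpha, warp_b, alpha, 0.0)

# e.g., a 30-frame transition between the two cameras:
# frames = [morph(img_a, img_b, pts_a, pts_b, a) for a in np.linspace(0, 1, 30)]
```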
- Award ID(s): 1900875
- PAR ID: 10438195
- Date Published:
- Journal Name: 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)
- Page Range / eLocation ID: 281 to 286
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We investigate a novel communications system that integrates scalable multi-layer 360-degree video tiling, viewport-adaptive rate-distortion optimal resource allocation, and VR-centric edge computing and caching, to enable future high-quality untethered VR streaming. Our system comprises a collection of 5G small cells that can pool their communication, computing, and storage resources to collectively deliver scalable 360-degree video content to mobile VR clients at much higher quality. Our major contributions are the rigorous design of multi-layer 360-degree tiling and related models of statistical user navigation, and the analysis and optimization of edge-based multi-user VR streaming that integrates viewport adaptation and server cooperation. We also explore the possibility of network-coded data operation and its implications for the analysis, optimization, and system performance we pursue here. We demonstrate considerable gains in delivered immersion fidelity, featuring much higher 360-degree viewport peak signal-to-noise ratio (PSNR) and VR video frame rates and spatial resolutions.
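A hedged sketch of the kind of viewport-adaptive allocation described above (not the paper's optimizer): greedily spend a bandwidth budget on scalable-layer upgrades ranked by expected quality gain per bit, weighting each tile by the probability that the user's viewport covers it. All inputs below are illustrative assumptions.

```python
# Hypothetical greedy viewport-weighted layer allocation. Each tile can
# receive successive scalable layers; a layer's expected gain is its
# quality increment times the tile's viewport probability.

def allocate(view_prob, layer_bits, layer_gain, budget):
    """Return the per-tile layer counts chosen under the bit budget."""
    chosen = [0] * len(view_prob)   # layers already granted per tile
    spent = 0.0
    while True:
        best, best_ratio = None, 0.0
        for t, nxt in enumerate(chosen):
            if nxt >= len(layer_bits):
                continue            # tile already at the top layer
            cost = layer_bits[nxt]
            gain = view_prob[t] * layer_gain[nxt]
            if spent + cost <= budget and gain / cost > best_ratio:
                best, best_ratio = t, gain / cost
        if best is None:
            return chosen           # no affordable upgrade remains
        chosen[best] += 1
        spent += layer_bits[chosen[best] - 1]

# Toy example: 4 tiles, up to 3 scalable layers each.
print(allocate(view_prob=[0.5, 0.3, 0.15, 0.05],
               layer_bits=[2.0, 3.0, 5.0],   # Mbit per extra layer
               layer_gain=[10.0, 4.0, 2.0],  # PSNR-like increment per layer
               budget=12.0))
```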
Motivated by recent progress on stochastic matching with few queries, we embark on a systematic study of the sparsification of stochastic packing problems more generally. Specifically, we consider packing problems where elements are independently active with a given probability p, and ask whether one can (non-adaptively) compute a "sparse" set of elements guaranteed to contain an approximately optimal solution to the realized (active) subproblem. We seek structural and algorithmic results of broad applicability to such problems. Our focus is on computing sparse sets containing on the order of d feasible solutions to the packing problem, where d is linear or at most polynomial in 1/p. Crucially, we require d to be independent of the number of elements, or any parameter related to the "size" of the packing problem. We refer to d as the "degree" of the sparsifier, consistent with graph-theoretic degree in the special case of matching. First, we exhibit a generic sparsifier of degree 1/p based on contention resolution. This sparsifier's approximation ratio matches the best contention resolution scheme (CRS) for any packing problem with additive objectives, and approximately matches the best monotone CRS for submodular objectives. Second, we set out to outperform this generic sparsifier for additive optimization over matroids and their intersections, as well as weighted matching. These improved sparsifiers feature different algorithmic and analytic approaches, and have degree linear in 1/p. In the case of a single matroid, our sparsifier tends to the optimal solution. In the case of weighted matching, we combine our contention-resolution-based sparsifier with technical approaches from prior work to improve the state-of-the-art ratio from 0.501 to 0.536. Third, we examine packing problems with submodular objectives. We show that even the simplest such problems do not admit sparsifiers approaching optimality. We then outperform our generic sparsifier for some special cases with submodular objectives.
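A toy illustration of the sparsification idea for the special case of matching (the constants and the random graph are assumptions, and greedy matching stands in for an exact solver): each edge is active independently with probability p, the sparsifier non-adaptively keeps on the order of 1/p incident edges per vertex before activity is revealed, and we compare matchings found on the active subgraph with and without sparsification.

```python
# Toy sketch of non-adaptive sparsification for stochastic matching.
import random

def greedy_matching_size(edges):
    """Size of a maximal matching found greedily."""
    matched, size = set(), 0
    for u, v in edges:
        if u not in matched and v not in matched:
            matched.update((u, v))
            size += 1
    return size

random.seed(0)
n, p, c = 200, 0.1, 2
edges = [(u, v) for u in range(n) for v in range(u + 1, n)
         if random.random() < 0.2]          # dense-ish random graph

incident = {u: [] for u in range(n)}
for e in edges:
    incident[e[0]].append(e)
    incident[e[1]].append(e)

deg_cap = int(c / p)                        # sparsifier degree ~ 1/p
keep = set()
for u in range(n):
    keep.update(random.sample(incident[u], min(deg_cap, len(incident[u]))))

active = {e for e in edges if random.random() < p}   # realized subproblem
print("matching on full active graph:", greedy_matching_size(sorted(active)))
print("matching on sparsified active:", greedy_matching_size(sorted(keep & active)))
```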
Recent research has used virtual environments (VEs), presented via virtual reality (VR) headsets, to study human behavior in hypothetical fire scenarios. One goal of using VEs in fire scenarios is to elicit patterns of behavior that more closely align with how individuals would react in real fire emergencies. The present study investigated whether elicited behaviors and perceived risk varied when fire scenarios were presented as VEs under two viewing conditions: a VR condition, where the VE was rendered as 360-degree videos presented in a VR headset, and a screen condition, where the VE was rendered as fixed-view videos on a computer monitor. We predicted that the selection of actions during the scenario would vary between conditions, that participants would rate fires as more dangerous if they developed more quickly and when smoke was rendered as thicker, and that participants would report greater levels of immersion in the VR condition. A total of 159 participants completed a decision-making task in which they viewed videos of an incipient fire in a residential building and judged what action to take. Initial action responses to the fire scenarios varied between both viewing and smoke conditions, with those assigned to the thicker-smoke and screen conditions being more likely to take protective action. Risk ratings also varied by smoke condition, with evidence of higher perceived risk for thicker smoke. Several factors of self-reported immersion (namely 'interest', 'emotional attachment', 'focus of attention', and 'flow') were associated with risk ratings, and perceived presence was associated with initial actions. The present study provides evidence that enhancing immersion and perceived risk in a VE contributes to a different pattern of behaviors during simulated fire decision-making tasks. While our investigation only addressed presence in an environment, future research should investigate the relative contributions of interactivity and consequences within the environment to further identify how behaviors during simulated fire scenarios are affected by each of these factors.
Multimodal Learning Analytics (MMLA) has emerged as a powerful approach within the computer-supported collaborative learning community, offering nuanced insights into learning processes through diverse data sources. Despite its potential, the prevalent reliance on traditional instruments such as tripod-mounted digital cameras for video capture often results in suboptimal data quality for the facial expressions captured, which is crucial for understanding collaborative dynamics. This study introduces an innovative approach to overcome this limitation by employing 360-degree camera technology to capture students' facial features while they collaborate in small working groups. A comparative analysis of 1.5 hours of video data from both traditional tripod-mounted digital cameras and 360-degree cameras evaluated the efficacy of these methods in capturing Facial Action Units (AUs) and facial keypoints. The use of OpenFace revealed that the 360-degree camera captured high-quality facial features in 33.17% of frames, significantly outperforming the traditional method's 8.34%, thereby enhancing reliability in facial feature detection. The findings suggest a pathway for future research to integrate 360-degree camera technology in MMLA. Future research directions involve refining this technology further to improve the detection of affective states in collaborative learning environments, thereby offering a richer understanding of the learning process.
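A minimal sketch of the frame-quality comparison described above, assuming OpenFace's standard CSV output (one row per video frame, including `success` and `confidence` columns): count the fraction of frames with reliably detected facial features for each camera's footage. The file names and the 0.8 confidence threshold are illustrative assumptions.

```python
# Sketch: compare usable-frame rates of two OpenFace output CSVs.
import pandas as pd

def usable_fraction(csv_path, min_conf=0.8):
    """Fraction of frames where OpenFace reported a confident detection."""
    df = pd.read_csv(csv_path)
    df.columns = df.columns.str.strip()   # OpenFace pads column names with spaces
    ok = (df["success"] == 1) & (df["confidence"] >= min_conf)
    return ok.mean()

for name in ("tripod_camera.csv", "camera_360.csv"):   # hypothetical files
    print(name, "-> %.2f%% usable frames" % (100 * usable_fraction(name)))
```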