Title: Ultra-Sparse 360-Degree Camera View Synthesis for Immersive Virtual Tourism
360-degree-video-based virtual tours are becoming increasingly popular due to travel costs and restrictions. Existing solutions rely on teleportation, 3D modeling, or image morphing, but none offers satisfactory immersion and scalability. In this paper, we propose a morphing-based virtual tourism solution for ultra-sparse 360-degree cameras. It introduces a novel bus-tour mode to improve immersion, and it employs a series of strategies to improve feature matching so that morphing works well for ultra-sparse (15 m apart) cameras and the system can be deployed at large scale. Experimental results show that our approach produces remarkably better feature matching and synthesized views.
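As a rough illustration of the feature-matching and blending steps such a morphing pipeline depends on, the sketch below matches ORB keypoints between two equirectangular frames and cross-fades a warped intermediate view with OpenCV. It is a minimal sketch under assumed inputs, not the paper's method (which adds further strategies to keep matching reliable at 15 m spacing); the function names and parameters are illustrative, and a single global homography is only a crude stand-in for true morphing.

```python
# Illustrative only: minimal feature matching and blending in OpenCV, assuming two
# grayscale equirectangular frames img_a and img_b.
import cv2
import numpy as np

def match_features(img_a, img_b, max_matches=500):
    """Match ORB keypoints between two equirectangular images."""
    orb = cv2.ORB_create(nfeatures=4000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches[:max_matches]])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches[:max_matches]])
    return pts_a, pts_b

def morph_views(img_a, img_b, pts_a, pts_b, alpha=0.5):
    """Warp A toward B with a RANSAC homography and cross-fade the two frames."""
    h, _ = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 5.0)
    warped = cv2.warpPerspective(img_a, h, (img_b.shape[1], img_b.shape[0]))
    return cv2.addWeighted(warped, 1.0 - alpha, img_b, alpha, 0)
```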
Award ID(s): 1900875
PAR ID: 10438195
Author(s) / Creator(s):
Date Published:
Journal Name: 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)
Page Range / eLocation ID: 281 to 286
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Immersive virtual tours based on 360-degree cameras, showing famous outdoor scenery, are becoming increasingly desirable due to travel costs, pandemics, and other constraints. To feel immersive, a user must receive the view that accurately corresponds to her position and orientation in the virtual space as she moves within it, and this requires the cameras' orientations to be known. Outdoor tour settings have numerous, ultra-sparse cameras deployed across a wide area, making camera pose estimation challenging. As a result, pose estimation techniques such as SLAM, which require mobile or densely placed cameras, are not applicable. In this paper we present a novel strategy called 360ViewPET, which automatically estimates the relative poses of two stationary, ultra-sparse (15 meters apart) 360-degree cameras using one equirectangular image taken by each camera. Our experiments show that it achieves accurate pose estimation, with a mean error as low as 0.9 degrees.
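For context, here is a hedged sketch of the classical two-view geometry this problem usually starts from, not the 360ViewPET algorithm itself: matched equirectangular pixels are lifted to unit bearing vectors and an essential matrix is fit with a linear, 8-point-style solve. The helper names, the spherical convention, and the absence of RANSAC are simplifying assumptions.

```python
# Hypothetical sketch: relative-pose geometry for two 360-degree cameras from one
# equirectangular image each. A real pipeline would add outlier rejection and
# decomposition of E into rotation and translation.
import numpy as np

def pixel_to_bearing(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction on the sphere."""
    lon = (u / width) * 2.0 * np.pi - np.pi        # longitude in [-pi, pi)
    lat = np.pi / 2.0 - (v / height) * np.pi       # latitude in [-pi/2, pi/2]
    return np.array([np.cos(lat) * np.sin(lon),
                     -np.sin(lat),
                     np.cos(lat) * np.cos(lon)])

def essential_from_bearings(bearings_a, bearings_b):
    """Fit E such that b_b^T E b_a = 0 from >= 8 matched unit bearings."""
    A = np.stack([np.kron(b, a) for a, b in zip(bearings_a, bearings_b)])
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Project onto the essential-matrix manifold: singular values (1, 1, 0).
    u, _, vt = np.linalg.svd(E)
    return u @ np.diag([1.0, 1.0, 0.0]) @ vt
```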
  2.
    We investigate a novel communications system that integrates scalable multi-layer 360-degree video tiling, viewport-adaptive rate-distortion-optimal resource allocation, and VR-centric edge computing and caching, to enable future high-quality untethered VR streaming. Our system comprises a collection of 5G small cells that can pool their communication, computing, and storage resources to collectively deliver scalable 360-degree video content to mobile VR clients at much higher quality. Our major contributions are the rigorous design of multi-layer 360-degree tiling and related models of statistical user navigation, and the analysis and optimization of edge-based multi-user VR streaming that integrates viewport adaptation and server cooperation. We also explore the possibility of network-coded data operation and its implications for the analysis, optimization, and system performance we pursue here. We demonstrate considerable gains in delivered immersion fidelity, featuring much higher 360-degree viewport peak signal-to-noise ratio (PSNR) as well as higher VR video frame rates and spatial resolutions.
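A hedged sketch of the viewport-adaptive idea follows; it is not the paper's rate-distortion formulation, and the data structures and scoring rule are assumptions. It greedily spends a per-client bandwidth budget on enhancement layers for the tiles most likely to fall in the viewport.

```python
# Assumed setup, not the paper's exact optimization: greedy layer allocation for tiles.
def allocate_tile_layers(view_prob, layer_bitrate, layer_utility, budget):
    """view_prob[t]: probability tile t falls in the viewport;
    layer_bitrate[l] / layer_utility[l]: incremental cost and quality gain of
    layer l (l = 0 is the always-sent base layer)."""
    chosen = {t: 0 for t in view_prob}                      # base layer for every tile
    spent = layer_bitrate[0] * len(view_prob)
    # Score candidate upgrades by expected utility per bit.
    candidates = [(p * layer_utility[l] / layer_bitrate[l], t, l)
                  for t, p in view_prob.items()
                  for l in range(1, len(layer_bitrate))]
    for _, t, l in sorted(candidates, key=lambda c: c[0], reverse=True):
        if l == chosen[t] + 1 and spent + layer_bitrate[l] <= budget:
            chosen[t] = l                                   # upgrade one layer at a time
            spent += layer_bitrate[l]
    return chosen
```

In the paper's setting this allocation would be driven by a rate-distortion model and cooperating edge servers; the greedy rule above only illustrates the viewport-weighted trade-off.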
  3. Etessami, Kousha; Feige, Uriel; Puppis, Gabriele (Ed.)
    Motivated by recent progress on stochastic matching with few queries, we embark on a systematic study of the sparsification of stochastic packing problems more generally. Specifically, we consider packing problems where elements are independently active with a given probability p, and ask whether one can (non-adaptively) compute a "sparse" set of elements guaranteed to contain an approximately optimal solution to the realized (active) subproblem. We seek structural and algorithmic results of broad applicability to such problems. Our focus is on computing sparse sets containing on the order of d feasible solutions to the packing problem, where d is linear or at most polynomial in 1/p. Crucially, we require d to be independent of the number of elements, or any parameter related to the "size" of the packing problem. We refer to d as the "degree" of the sparsifier, as is consistent with graph-theoretic degree in the special case of matching. First, we exhibit a generic sparsifier of degree 1/p based on contention resolution. This sparsifier’s approximation ratio matches the best contention resolution scheme (CRS) for any packing problem with additive objectives, and approximately matches the best monotone CRS for submodular objectives. Second, we set out to outperform this generic sparsifier for additive optimization over matroids and their intersections, as well as weighted matching. These improved sparsifiers feature different algorithmic and analytic approaches, and have degree linear in 1/p. In the case of a single matroid, our sparsifier tends to the optimal solution. In the case of weighted matching, we combine our contention-resolution-based sparsifier with technical approaches of prior work to improve the state-of-the-art ratio from 0.501 to 0.536. Third, we examine packing problems with submodular objectives. We show that even the simplest such problems do not admit sparsifiers approaching optimality. We then outperform our generic sparsifier for some special cases with submodular objectives.
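As a concrete, hedged illustration of the "degree on the order of 1/p" idea in the special case of stochastic matching, the sketch below uses the standard union-of-matchings construction (not the paper's contention-resolution-based sparsifier): take the union of maximum matchings over roughly 1/p simulated realizations.

```python
# Hedged illustration for stochastic matching; names and parameters are assumptions.
import random
import networkx as nx

def sparsify_for_stochastic_matching(graph, p, rounds=None):
    """Return a sparse subgraph whose maximum degree is at most `rounds` (about 1/p)."""
    rounds = rounds or max(1, round(1.0 / p))
    sparse = nx.Graph()
    sparse.add_nodes_from(graph.nodes)
    for _ in range(rounds):
        # Simulate one realization: each edge is active independently with probability p.
        realized = nx.Graph()
        realized.add_edges_from((u, v) for u, v in graph.edges if random.random() < p)
        # Keep a maximum matching of the simulated realization.
        sparse.add_edges_from(nx.max_weight_matching(realized, maxcardinality=True))
    return sparse
```

Each round contributes at most one edge per vertex, so the output degree is bounded by the number of rounds, i.e., linear in 1/p and independent of the number of elements.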
  4.
    Future view prediction for a 360-degree video streaming system is important for saving network bandwidth and improving the Quality of Experience (QoE). Historical view data from a single viewer and from multiple viewers have been used for future view prediction. Video semantic information is also useful for predicting the viewer's future behavior. However, extracting video semantic information requires powerful computing hardware and large memory space to perform deep-learning-based video analysis, which is not practical for most client devices, such as small mobile devices or Head-Mounted Displays (HMDs). Therefore, we develop an approach in which video semantic analysis is executed on the media server, and the analysis results are shared with clients via the Semantic Flow Descriptor (SFD) and View-Object State Machine (VOSM). SFD and VOSM become new descriptive additions to the Media Presentation Description (MPD) and Spatial Relation Description (SRD) to support 360-degree video streaming. Using this semantic-based approach, we design the Semantic-Aware View Prediction System (SEAWARE) to improve overall view prediction performance. Evaluation results on 360-degree videos and real HMD view traces show that the SEAWARE system improves view prediction performance and streams high-quality video with limited network bandwidth.
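A minimal sketch of the general idea follows: the client blends a motion-based extrapolation of its own head trace with a server-provided semantic hint about where interesting content will be. The blending weight, field names, and scoring are assumptions, not the SFD/VOSM design.

```python
# Hypothetical sketch: combine head-motion extrapolation with a server-side semantic
# hint. Angles are in degrees; yaw wrap-around is ignored for brevity.
import numpy as np

def predict_view(history, salient_object_dir, horizon=1.0, w_semantic=0.3):
    """history: list of (t, yaw, pitch) samples; salient_object_dir: (yaw, pitch) of the
    highest-scored object from server-side semantic analysis of the upcoming segment."""
    (t0, yaw0, pitch0), (t1, yaw1, pitch1) = history[-2], history[-1]
    dt = max(t1 - t0, 1e-6)
    # Motion-based extrapolation of the head orientation `horizon` seconds ahead.
    yaw_pred = yaw1 + (yaw1 - yaw0) / dt * horizon
    pitch_pred = pitch1 + (pitch1 - pitch0) / dt * horizon
    # Pull the prediction toward the semantically interesting direction.
    obj_yaw, obj_pitch = salient_object_dir
    yaw_pred = (1.0 - w_semantic) * yaw_pred + w_semantic * obj_yaw
    pitch_pred = (1.0 - w_semantic) * pitch_pred + w_semantic * obj_pitch
    return yaw_pred % 360.0, float(np.clip(pitch_pred, -90.0, 90.0))
```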
  5. As virtual reality (VR) offers an experience unmatched by any existing multimedia technology, VR videos, also called 360-degree videos, have attracted considerable attention from academia and industry. How to quantify and model end users' perceived quality when watching 360-degree videos, i.e., their Quality of Experience (QoE), lies at the center of high-quality provisioning of these multimedia services. In this work, we present EyeQoE, a novel QoE assessment model for 360-degree videos using ocular behaviors. Unlike prior approaches, which mostly rely on objective factors, EyeQoE leverages the new ocular sensing modality to comprehensively capture both subjective and objective impact factors for QoE modeling. We propose a novel method that models eye-based cues as graphs and develop a GCN-based classifier that produces QoE assessments by extracting intrinsic features from graph-structured data. We further exploit a Siamese network to eliminate the impact of subject and visual-stimulus heterogeneity. A domain adaptation scheme named MADA is also devised to generalize our model to a vast range of unseen 360-degree videos. Extensive tests are carried out with our collected dataset. Results show that EyeQoE achieves the best prediction accuracy at 92.9%, outperforming state-of-the-art approaches. As another contribution of this work, we have made our dataset publicly available at https://github.com/MobiSec-CSE-UTA/EyeQoE_Dataset.git.
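To make the graph-based modeling step concrete, below is a minimal GCN-style classifier over a gaze graph in plain PyTorch; the node features, layer sizes, and mean pooling are assumptions rather than EyeQoE's actual architecture (which also adds a Siamese branch and the MADA adaptation scheme).

```python
# Minimal sketch: two-layer GCN over a gaze graph, followed by mean pooling and a
# QoE classification head. Feature dimensions are illustrative.
import torch
import torch.nn as nn

class GazeGCN(nn.Module):
    def __init__(self, in_dim=8, hidden=32, num_classes=2):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden)
        self.w2 = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x, adj):
        """x: [N, in_dim] node features (e.g., fixation statistics); adj: [N, N] adjacency."""
        # Symmetrically normalized adjacency with self-loops (standard GCN propagation).
        a = adj + torch.eye(adj.size(0))
        d = a.sum(dim=1).rsqrt()
        a_norm = d.unsqueeze(1) * a * d.unsqueeze(0)
        h = torch.relu(self.w1(a_norm @ x))
        h = torch.relu(self.w2(a_norm @ h))
        return self.head(h.mean(dim=0))        # mean-pool node embeddings, then classify
```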