

Title: Synthesizing scene-aware virtual reality teleport graphs
We present a novel approach for synthesizing scene-aware virtual reality teleport graphs, which facilitate navigation in indoor virtual environments by suggesting desirable teleport positions. Our approach analyzes panoramic views at candidate teleport positions by extracting scene perception graphs, which encode scene perception relationships between the observer and the surrounding objects, and predicts how desirable the views at these positions are. We train a graph convolutional model to predict the scene perception scores of different teleport positions. Based on such predictions, we apply an optimization approach that samples a set of desirable teleport positions, while also considering navigation properties such as coverage and connectivity, to synthesize a teleport graph. Using teleport graphs, users can navigate virtual environments effectively. We demonstrate our approach by synthesizing teleport graphs for common indoor scenes. Through a user study, we validate the efficacy and desirability of navigating virtual environments via the synthesized teleport graphs. We also extend our approach to cope with different constraints, user preferences, and practical scenarios.
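As a rough illustration of the selection stage described above, the sketch below scores candidate positions with a stand-in scoring function (in place of the paper's trained graph convolutional model) and greedily assembles a teleport graph that balances view desirability, coverage, and connectivity. All names, weights, and thresholds are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the teleport-graph selection stage. The scorer is a
# stand-in for the paper's trained graph convolutional model; min_sep and
# max_edge are illustrative coverage/connectivity parameters.
import math

def synthesize_teleport_graph(candidates, score_fn, k=8,
                              min_sep=1.0, max_edge=3.0):
    """candidates: list of (x, y) positions; score_fn: position -> score."""
    scored = sorted(candidates, key=score_fn, reverse=True)
    nodes = []
    for p in scored:
        # Coverage: keep selected positions spread out by a minimum separation.
        if all(math.dist(p, q) >= min_sep for q in nodes):
            nodes.append(p)
        if len(nodes) == k:
            break
    # Connectivity: link node pairs close enough for a direct teleport.
    edges = [(i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes))
             if math.dist(nodes[i], nodes[j]) <= max_edge]
    return nodes, edges

# Toy usage: a grid of candidates scored by proximity to the room center.
grid = [(x * 0.5, y * 0.5) for x in range(10) for y in range(10)]
nodes, edges = synthesize_teleport_graph(grid, lambda p: -math.dist(p, (2.5, 2.5)))
```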
Award ID(s):
1942531
NSF-PAR ID:
10321074
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Graphics
Volume:
40
Issue:
6
ISSN:
0730-0301
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Visual scenes are often remembered as if they were observed from a different viewpoint. Some scenes are remembered as farther than they appeared, and others as closer. These memory distortions—also known as boundary extension and contraction—are strikingly consistent for a given scene, but their cause remains unknown. We tested whether these distortions can be explained by an inferential process that adjusts scene memories toward high-probability views, using viewing depth as a test case. We first carried out a large-scale analysis of depth maps of natural indoor scenes to quantify the statistical probability of views in depth. We then assessed human observers' memory for these scenes at various depths and found that viewpoint judgments were consistently biased toward the modal depth, even when just a few seconds elapsed between viewing and reporting. Thus, scenes closer than the modal depth showed a boundary-extension bias (remembered as farther away), and scenes farther than the modal depth showed a boundary-contraction bias (remembered as closer). By contrast, scenes at the modal depth did not elicit a consistent bias in either direction. This same pattern of results was observed in a follow-up experiment using tightly controlled stimuli from virtual environments. Together, these findings show that scene memories are biased toward statistically probable views, which may serve to increase the accuracy of noisy or incomplete scene representations.
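A minimal sketch of the inferential account described above: memory is pulled toward the modal viewing depth estimated from scene statistics. The depth distribution and the shrinkage weight below are illustrative assumptions, not the study's actual analysis pipeline.

```python
# Toy model: remembered depth shrinks toward the modal depth of natural views.
import numpy as np

def modal_depth(depth_samples, bins=50):
    counts, edges = np.histogram(depth_samples, bins=bins)
    i = counts.argmax()
    return 0.5 * (edges[i] + edges[i + 1])  # midpoint of the modal bin

def remembered_depth(viewed, mode, weight=0.2):
    # Bias toward the mode: closer-than-modal scenes are remembered farther
    # (boundary extension), farther-than-modal scenes closer (contraction);
    # at the mode, no consistent bias.
    return viewed + weight * (mode - viewed)

depths = np.random.lognormal(mean=1.2, sigma=0.4, size=10_000)  # stand-in stats
mode = modal_depth(depths)
print(remembered_depth(2.0, mode), remembered_depth(8.0, mode))
```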
  2. Placing and orienting a camera to compose aesthetically meaningful shots of a scene is a key objective not only in real-world photography and cinematography but also in virtual content creation. The framing of a camera often contributes significantly to the storytelling in movies, games, and mixed reality applications. Generating single camera poses or even contiguous trajectories either requires a significant amount of manual labor or requires solving high-dimensional optimization problems, which can be computationally demanding and error-prone. In this paper, we introduce GAIT, a Deep Reinforcement Learning (DRL) agent that learns to automatically control a camera to generate a sequence of aesthetically meaningful views for synthetic 3D indoor scenes. To generate sequences of frames with high aesthetic value, GAIT relies on a neural aesthetics estimator, which is trained on a crowd-sourced dataset. Additionally, we introduce regularization techniques for diversity and smoothness: the former generates visually interesting trajectories for a 3D environment, and the latter constrains agent acceleration in the reward function to produce a smooth sequence of camera frames. We validated our method by comparing it to baseline algorithms, based on a perceptual user study, and through ablation studies. The source code of our method will be released with the final version of our paper.
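A hedged sketch of a reward in the spirit of the description above: a learned aesthetics term plus regularizers for view diversity and low-acceleration camera motion. The estimator, feature representation, and weights are placeholders; GAIT's exact formulation may differ.

```python
# Illustrative camera reward: aesthetics + diversity - acceleration penalty.
import numpy as np

def camera_reward(frame_feats, prev_feats, accel,
                  aesthetics_fn, w_div=0.1, w_smooth=0.05):
    r_aes = aesthetics_fn(frame_feats)                  # learned aesthetic score
    r_div = min((np.linalg.norm(frame_feats - f) for f in prev_feats),
                default=0.0)                            # distance to past views
    r_smooth = -np.linalg.norm(accel)                   # penalize acceleration
    return r_aes + w_div * r_div + w_smooth * r_smooth
```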
  3. Agaian, Sos S.; DelMarco, Stephen P.; Asari, Vijayan K. (Eds.)
    High-accuracy localization and user position tracking are critical to improving the quality of augmented reality environments. The biggest challenge facing developers is localizing the user based on the visible surroundings. Current solutions rely on the Global Positioning System (GPS) for tracking and orientation. However, GPS receivers have an accuracy of about 10 to 30 meters, which is not accurate enough for augmented reality, which needs precision measured in millimeters or smaller. This paper describes the development and demonstration of a head-worn augmented reality (AR) based vision-aided indoor navigation system that localizes the user without relying on a GPS signal. A commercially available augmented reality headset allows individuals to capture their field of vision in real time using the front-facing camera. Using captured image features as navigation-related landmarks allows localizing the user in the absence of a GPS signal. The proposed method involves three steps: collecting detailed front-scene camera data for landmark recognition; detecting and locating the individual's current position using feature matching; and displaying arrows to indicate areas that require more data collection if needed. Computer simulations indicate that the proposed augmented reality-based vision-aided indoor navigation system can provide precise simultaneous localization and mapping in a GPS-denied environment. Keywords: augmented reality, navigation, GPS, HoloLens, vision, positioning system, localization
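An illustrative sketch of the feature-matching step: match ORB features from the headset's front camera against a pre-collected landmark database and report the best-matching mapped location. The database format, distance threshold, and match count are assumptions; the paper's full pipeline (and any HoloLens-specific APIs) are not shown.

```python
# Landmark-based localization via ORB feature matching (OpenCV).
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def localize(frame_gray, landmark_db, min_matches=25):
    """landmark_db: list of (position_label, descriptors) pairs."""
    _, desc = orb.detectAndCompute(frame_gray, None)
    if desc is None:
        return None
    best_label, best_count = None, 0
    for label, ref_desc in landmark_db:
        matches = matcher.match(desc, ref_desc)
        good = [m for m in matches if m.distance < 40]  # assumed threshold
        if len(good) > best_count:
            best_label, best_count = label, len(good)
    return best_label if best_count >= min_matches else None
```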
  4. Abstract—The Materials Genomics initiative has the goal of rapidly synthesizing materials with a given set of desired properties using data science techniques. An important step in this direction is the ability to predict the outcomes of complex chemical reactions. Some graph-based feature learning algorithms have been proposed recently. However, they do not properly learn the comprehensive relationships between atoms or structures, are not explainable, and cannot handle multiple graphs. In this paper, chemical reaction processes are formulated as translation processes. Both atoms and edges are mapped to vectors representing the structural information. We employ graph convolution layers to learn meaningful information from atom graphs, and further employ two of their variations, message passing neural networks (MPNN) and the edge attention graph convolution network (EAGCN), to learn edge representations. In particular, the multi-view EAGCN groups and maps edges to a set of representations for the properties of the chemical bond between atoms from multiple views. Each bond is viewed in terms of its atom types, bond type, distance, and neighbor environment. The final node and edge representations are mapped to a sequence defined by the SMILES string of the molecule and then fed to a decoder model with attention. To make full use of the multi-view information, we propose a multi-view attention model to handle self-correlation inside each atom or edge, and mutual correlation between edges and atoms, both of which are important in chemical reaction processes. We have evaluated our method on standard benchmark datasets (those used by all prior works), and the results show that edge embedding with multi-view attention achieves superior accuracy compared to existing techniques.
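A compact sketch of an edge-attention graph convolution in the spirit of the EAGCN described above: per-edge attention weights, computed from edge features, modulate neighbor aggregation. Tensor shapes and the attention form are simplified assumptions for illustration, not the paper's architecture.

```python
# Simplified edge-attention graph convolution layer (PyTorch).
import torch
import torch.nn as nn

class EdgeAttentionConv(nn.Module):
    def __init__(self, node_dim, edge_dim, out_dim):
        super().__init__()
        self.node_proj = nn.Linear(node_dim, out_dim)
        self.edge_score = nn.Linear(edge_dim, 1)

    def forward(self, x, edge_feats, adj):
        # x: (N, node_dim), edge_feats: (N, N, edge_dim), adj: (N, N) 0/1 mask.
        att = self.edge_score(edge_feats).squeeze(-1)   # raw per-edge scores
        att = att.masked_fill(adj == 0, float('-inf'))  # restrict to neighbors
        att = torch.softmax(att, dim=-1)                # normalize over neighbors
        att = torch.nan_to_num(att)                     # handle isolated nodes
        return torch.relu(att @ self.node_proj(x))      # aggregate and update
```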
  5. Despite the phenomenal advances in the computational power and functionality of electronic systems, human-machine interaction has largely been limited to simple control panels, the keyboard, the mouse, and the display. Consequently, these systems either rely critically on close human guidance or operate almost independently of the user. An exemplar technology integrated tightly into our lives is the smartphone. However, the term "smart" is a misnomer, since the phone has fundamentally no intelligence with which to understand its user. Users still have to type, touch, or speak (to some extent) to express their intentions in a form accessible to the phone. Hence, intelligent decision making is still almost entirely a human task. A life-changing experience can be achieved by transforming machines from passive tools into agents capable of understanding human physiology and what their user wants [1]. This can advance human capabilities in unimagined ways by building a symbiotic relationship to solve real-world problems cooperatively. One high-impact application area of this approach is assistive internet of things (IoT) technologies for physically challenged individuals. The Annual World Report on Disability reveals that 15% of the world population lives with disability, while 110 to 190 million of these people have difficulty in functioning [1]. Quality of life for this population can improve significantly if we can provide accessibility to smart devices that provide sensory inputs and assist with everyday tasks. This work demonstrates that smart IoT devices open up the possibility of alleviating the burden on the user by equipping everyday objects, such as a wheelchair, with decision-making capabilities. Moving part of the intelligent decision making to smart IoT objects requires a robust mechanism for human-machine communication (HMC). To address this challenge, we present examples of multimodal HMC mechanisms, where the modalities are electroencephalogram (EEG) signals, speech commands, and motion sensing. We also introduce an IoT co-simulation framework developed using a network simulator (OMNeT++) and a robot simulation platform, the Virtual Robot Experimentation Platform (V-REP). We show how this framework is used to evaluate the effectiveness of different HMC strategies, using automated indoor navigation as a driver application.
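A minimal sketch of one possible HMC fusion strategy consistent with the setup described above: confidence-weighted votes from EEG, speech, and motion classifiers are combined into a single navigation command. The classifiers, weights, and command vocabulary are stand-ins; the OMNeT++/V-REP co-simulation itself is not modeled here.

```python
# Confidence-weighted fusion of multimodal commands into one decision.
from collections import defaultdict

def fuse_commands(readings, weights={'eeg': 0.5, 'speech': 1.0, 'motion': 0.8}):
    """readings: list of (modality, command, confidence) tuples."""
    votes = defaultdict(float)
    for modality, command, conf in readings:
        votes[command] += weights.get(modality, 0.0) * conf
    return max(votes, key=votes.get) if votes else None

cmd = fuse_commands([('eeg', 'forward', 0.6),
                     ('speech', 'forward', 0.9),
                     ('motion', 'left', 0.7)])
print(cmd)  # 'forward'
```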