Title: Effect and influence of ambisonic audio in viewing 360 video
Research has provided evidence of the value of producing multiple representations of content for learners (e.g., verbal, visual). However, much of this research has acknowledged changes in visual technologies while not recognizing or utilizing related audio innovations. For instance, teacher education students who were once taught through two-dimensional video are now presented with interactive, three-dimensional content (e.g., simulations or 360 video). In both the old and new formats, however, users still typically receive monophonic sound. Few studies have examined the impact of pairing three-dimensional sound with three-dimensional video in learning environments. The purpose of this study was to respond to this gap by comparing the outcomes of watching 360 video with either monophonic or ambisonic audio. Results provided evidence that ambisonic audio increased perceived presence for those familiar with the content being taught, led to differences in what ambisonic viewers noticed compared to monophonic groups, and improved participant focus while watching the 360 video. Implications for development and implementation in virtual worlds are discussed.
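For context, ambisonic audio represents a full-sphere sound field as a set of spherical-harmonic channels that can be rotated to track the viewer's head orientation, which is what lets sound stay spatially anchored in a 360 scene. The sketch below shows first-order (B-format) encoding of a mono source under the AmbiX convention (ACN channel order, SN3D normalization); the function name and the chosen conventions are illustrative assumptions, not the audio pipeline used in this study.

```python
import numpy as np

def encode_foa(mono, azimuth_rad, elevation_rad):
    """Encode a mono signal arriving from (azimuth, elevation) into
    first-order ambisonics (AmbiX: channel order W, Y, Z, X; SN3D).
    Hypothetical helper for illustration only."""
    w = mono                                                 # omnidirectional
    y = mono * np.sin(azimuth_rad) * np.cos(elevation_rad)   # left(+) / right(-)
    z = mono * np.sin(elevation_rad)                         # up(+) / down(-)
    x = mono * np.cos(azimuth_rad) * np.cos(elevation_rad)   # front(+) / back(-)
    return np.stack([w, y, z, x])

# Example: place a 1 kHz tone 90 degrees to the listener's left.
sr = 48000
t = np.arange(sr) / sr
b_format = encode_foa(np.sin(2 * np.pi * 1000 * t), np.deg2rad(90), 0.0)
```

At playback, the B-format channels are rotated by the inverse of the head orientation before decoding (e.g., binaurally), so a source keeps its position in the scene as the viewer looks around; monophonic audio, by contrast, cannot respond to head movement at all.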
Award ID(s):
1908159
PAR ID:
10274524
Author(s) / Creator(s):
Date Published:
Journal Name:
Journal of Virtual Worlds Research
Volume:
13
Issue:
2-3
ISSN:
1941-8477
Page Range / eLocation ID:
1-14
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1. The use of video is commonplace for professional preparation in education and other fields. Research has provided evidence that video use in these contexts can lead to increased noticing and reflection. However, educators now have access to evolving forms of video, such as 360 video. The purpose of this study was to adapt and validate an instrument for assessing immersive 360 video use in an undergraduate preservice teacher training program. Data provided evidence of the validity of the Extended Reality Presence Scale (XRPS) for 360 video research in preservice teacher professional development. Moreover, evidence from the study suggests that viewers with higher feelings of presence are less likely to jump around (or twitch) while watching 360 videos. The main implications are that (a) the XRPS is a validated and reliable instrument and (b) more research is needed to examine presence and viewing practices for in-service and preservice teachers watching 360 video.
  2. One of the most disruptive aspects of 2020 for teacher education, due mainly to COVID-19, was the loss of field placements for future teachers. Teacher educators attempted to fill this gap with videos of exemplary practice, which are commonly used in teacher education to supplement field experiences. In doing so, however, teacher educators should have learned about the potential and promise of 360 video for teaching and teacher education. This chapter highlights the research behind the use of 360 video and showcases how it has been used successfully in mathematics teacher education and physical education teacher education. The chapter presents evidence supporting the use of 360 video as a dissemination technique and as a technology skill that should be taught to current and future teachers. Finally, evidence is provided to suggest that the use of 360 video should continue even when field placements return fully face-to-face.
  3. Generating realistic audio for human actions is critical for applications such as film sound effects and virtual reality games. Existing methods assume complete correspondence between video and audio during training, but in real-world settings many sounds occur off-screen or correspond only weakly to the visuals, leading to uncontrolled ambient sounds or hallucinations at test time. This paper introduces AV-LDM, a novel ambient-aware audio generation model that disentangles foreground action sounds from ambient background noise in in-the-wild training videos. The approach leverages a retrieval-augmented generation framework to synthesize audio that aligns both semantically and temporally with the visual input. Trained and evaluated on the Ego4D and EPIC-KITCHENS datasets, along with the newly introduced Ego4D-Sounds dataset (1.2M curated clips with action-audio correspondence), the model outperforms prior methods, enables controllable ambient sound generation, and shows promise for generalization to synthetic video game clips. This work is the first to emphasize faithful video-to-audio generation focused on observed visual content despite noisy, uncurated training data.
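As a rough illustration of the retrieval-augmented idea summarized above, the sketch below performs only the retrieval step: it looks up the training clip whose video embedding best matches the query and returns that clip's audio as an ambient conditioning signal. All names here are hypothetical, and the actual AV-LDM generator is a latent diffusion model whose conditioning details are not reproduced here.

```python
import numpy as np

def retrieve_ambient_exemplar(video_emb, bank_video_embs, bank_audio_clips):
    """Nearest-neighbor retrieval (illustrative): return the audio of the bank
    clip whose video embedding is most similar, by cosine similarity, to the
    query video embedding."""
    q = video_emb / (np.linalg.norm(video_emb) + 1e-8)
    b = bank_video_embs / (np.linalg.norm(bank_video_embs, axis=1, keepdims=True) + 1e-8)
    return bank_audio_clips[int(np.argmax(b @ q))]
```

Conditioning the generator on a retrieved ambient exemplar, in addition to the video features, gives it an explicit source for background sound, so the foreground action audio does not have to absorb or hallucinate the ambience.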
  4. The joint analysis of audio and video is a powerful tool that can be applied to various contexts, including action, speech, and sound recognition, audio-visual video parsing, emotion recognition in affective computing, and self-supervised training of deep learning models. Solving these problems often involves tackling core audio-visual tasks, such as audio-visual source localization, audio-visual correspondence, and audio-visual source separation, which can be combined in various ways to achieve the desired results. This paper provides a review of the literature in this area, discussing the advancements, history, and datasets of audio-visual learning methods for various application domains. It also presents an overview of the reported performances on standard datasets and suggests promising directions for future research. 
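One of the core tasks named above, audio-visual correspondence, is commonly trained contrastively: embeddings of matched audio and video clips are pulled together while mismatched pairs within a batch are pushed apart. The sketch below computes the similarity logits for such a setup; the names and the temperature value are illustrative assumptions rather than a reproduction of any specific method in the review.

```python
import numpy as np

def av_correspondence_logits(audio_embs, video_embs, temperature=0.07):
    """Cosine-similarity logits between batches of audio and video embeddings
    (shape: batch x dim). In contrastive training, the diagonal entries are
    the matched pairs; a symmetric cross-entropy loss is typically applied
    over rows and columns."""
    a = audio_embs / np.linalg.norm(audio_embs, axis=1, keepdims=True)
    v = video_embs / np.linalg.norm(video_embs, axis=1, keepdims=True)
    return (a @ v.T) / temperature
```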
  5. Intelligent Virtual Agents (IVAs) have received enormous attention in recent years due to significant improvements in voice communication technologies and the convergence of research fields such as Machine Learning, the Internet of Things, and Virtual Reality (VR). Interactive conversational IVAs can appear in different forms, such as voice-only or with embodied audio-visual representations showing, for example, human-like, contextually related or generic three-dimensional bodies. In this paper, we analyzed the benefits of different forms of virtual agents in the context of a VR exhibition space. Our results provide evidence of large benefits of both embodied and thematically related audio-visual representations of IVAs. We discuss implications and suggestions for content developers seeking to design believable virtual agents in the context of such installations.