Title: Wave-Field Synthesis and Forced Perspective for Congruent Audiovisual Display
Wave field synthesis, a spatial audio technique, can perceptually locate sound sources in front of the loudspeaker array. This creates a challenge for pairing congruent visuals when using a traditional projection screen, as in many immersive environments. If the sound is placed within the environment, beyond the boundary of the screen, how can visuals be displayed so that they appear congruent with their corresponding sound? To address this, forced perspective is used: a technique in which objects are made to appear to be of different scales or at different positions than they are in reality. This allows content to appear as if it is spilling into the environment, and thus to remain congruent with its respective sound. Assessments of how audiovisual sources paired in this way are perceived, and of how effective the pairing is, are being planned. The technique described here can be used for various digital experiences, from artistic performances to 3D advertisements.
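The geometry behind forced perspective can be illustrated with a similar-triangles calculation. The sketch below is an illustration only and not the authors' implementation: it assumes a single viewer at the origin looking along the +z axis at a flat screen, and the names `project_to_screen`, `ScreenImage`, and `screen_dist` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreenImage:
    """Where and how large to draw an object on the screen plane (metres)."""
    x: float
    y: float
    size: float


def project_to_screen(obj_x: float, obj_y: float, obj_z: float,
                      obj_size: float, screen_dist: float) -> ScreenImage:
    """Project a virtual object onto a flat screen as seen from the origin.

    Assumptions (hypothetical setup): the viewer sits at (0, 0, 0), the
    screen is the plane z = screen_dist, and the object is at distance
    obj_z along the viewing axis. An object meant to appear in front of
    the screen has 0 < obj_z < screen_dist.
    """
    if obj_z <= 0 or screen_dist <= 0:
        raise ValueError("obj_z and screen_dist must be positive")
    # Similar triangles: the ray from the viewer through the object
    # reaches the screen plane magnified by screen_dist / obj_z.
    scale = screen_dist / obj_z
    return ScreenImage(obj_x * scale, obj_y * scale, obj_size * scale)


if __name__ == "__main__":
    # A 1 m object placed 2 m from the viewer, with the screen 4 m away.
    print(project_to_screen(0.5, 0.0, 2.0, 1.0, 4.0))
```

Under these assumptions, an object intended to appear closer than the screen is drawn larger and farther from the viewing axis than its nominal size and position. The result is exact only for the assumed viewer position; listeners elsewhere in the space would see a distorted image.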
Award ID(s):
1909229
PAR ID:
10530623
Author(s) / Creator(s):
Corporate Creator(s):
Editor(s):
Weger, Marian; Ziemer, Tim; Ronnberg, Niklas
Publisher / Repository:
https://icad2023.icad.org/
Date Published:
Edition / Version:
1
Volume:
2023
Issue:
1
ISBN:
0-9670904-9-0
Page Range / eLocation ID:
165-167
Subject(s) / Keyword(s):
Audio Visual, Immersive Systems, multichannel sound, forced perspective, sonification
Format(s):
Medium: X Size: 2MB Other: .pdf
Size(s):
2MB
Location:
https://repository.gatech.edu/entities/publication/6f73e8a9-ae9a-447f-a174-64ca28a2f3bb/full
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper proposes an experiential method for learning acoustics and consequences of room design through the rapid creation of audio-visual congruent walkable auralizations. An efficient method produces auralizations of acoustical landmarks using a two-dimensional ray-tracing algorithm and publicly available floor plans for a 128-channel wave-field synthesis system. Late reverberation parameters are calculated using additional volumetric data. Congruent visuals are produced using a web-based interface accessible via personal devices, which automatically formats for and transmits to the immersive display. Massive user-contributed online databases are harnessed through application programming interfaces, such as those offered by the Google Maps Platform, to provide near-instant access to innumerable locations. The approach allows the rapid sonic recreation of historical concert venues with adequate sound sources. Listeners can walk through these recreations over an extended user area (12 m × 10 m). 
  2. Generating realistic audio for human actions is critical for applications such as film sound effects and virtual reality games. Existing methods assume complete correspondence between video and audio during training, but in real-world settings, many sounds occur off-screen or weakly correspond to visuals, leading to uncontrolled ambient sounds or hallucinations at test time. This paper introduces AV-LDM, a novel ambient-aware audio generation model that disentangles foreground action sounds from ambient background noise in in-the-wild training videos. The approach leverages a retrieval-augmented generation framework to synthesize audio that aligns both semantically and temporally with the visual input. Trained and evaluated on Ego4D and EPIC-KITCHENS datasets, along with the newly introduced Ego4D-Sounds dataset (1.2M curated clips with action-audio correspondence), the model outperforms prior methods, enables controllable ambient sound generation, and shows promise for generalization to synthetic video game clips. This work is the first to emphasize faithful video-to-audio generation focused on observed visual content despite noisy, uncurated training data. 
  3. Kuperman, Alon (Ed.)
    Increasing the spatial and temporal density of data using networked sensors, known as the Internet of Things (IoT), can lead to enhanced productivity and cost savings in a host of industries. Where applications involve large outdoor expanses, such as farming, oil and gas, or defense, large regions of unelectrified land could yield significant benefits if instrumented with a high density of IoT systems. The major limitation of expanding IoT networks in such applications stems from the challenge of delivering power to each sensing device. Batteries, generators, and renewable sources have predominantly been used to address the challenge, but these solutions require constant maintenance or are sensitive to environmental factors. This work presents a novel approach in which conduction currents through soil are used for the wireless powering of sensor networks; the initial investigation covers a 0.8-ha (2-acre) area. The technique is not line-of-sight, powers all devices simultaneously through near-field mechanics, and can be minimally invasive to the working environment. A theory of operation is presented and the technique is experimentally demonstrated in an agricultural setting. Scaling and transfer parameters are discussed. 
  4. Equilibrium is a challenging concept for many, largely because developing a deep conceptual understanding of equilibrium requires someone to be able to connect the motions and interactions of particles that cannot be physically observed with macroscopic observations. Particle level chemistry animations and simulations can support student connections of particle motion with macroscopic observations, but for topics such as equilibrium additional visuals such as graphs are typically present which add additional complexity. Helping students make sense of such visuals requires careful scaffolding to draw their attention to important features and help them make connections between representations (e.g., particle motion and graphical representations). Further, as students enter our classrooms with varying levels of background understanding, they may require more or less time working with such simulations or animations to develop the desired level of conceptual understanding. This paper describes the development and testing of activities that use the PhET simulation “Reactions and Rates” to introduce the concept of equilibrium as a student preclass activity either in the form of directly using the simulation or guided by an instructor through a screencast. The pre-post analysis of the two most recent implementations of these activities indicates that students show improved understanding of the core ideas underlying equilibrium regardless of instructor, institution, or type of instructional environment (face to face or remote). We also observed that students were more readily able to provide particle level explanations of changes in equilibrium systems as they respond to stresses (such as changes to concentration and temperature) if they have had prior course instruction on collision theory. Lastly, we observed that student answers to explain how an equilibrium will respond to an applied stress more often focus on either initial responses or longer-term stability of concentrations, not on both key aspects. 
  5. Antona, M; Stephanidis, C (Ed.)
    Environmental sounds can provide important information about surrounding activity, yet recognizing sounds can be challenging for Deaf and Hard-of-Hearing (DHH) individuals. Prior work has examined the preferences of DHH users for various sound-awareness methods. However, these preferences have been observed to vary along some demographic factors. Thus, in this study we investigate the preferences of a specific group of DHH users: current users of assistive listening devices. Through a survey of 38 participants, we investigated their challenges and requirements for sound-awareness applications, as well as which types of sounds and which aspects of those sounds are important to them. We found that users of assistive listening devices still often miss sounds and rely on other people to obtain information about them. Participants indicated that the importance of awareness of different types of sounds varied according to the environment and the form factor of the sound-awareness technology. Congruent with prior work, participants reported that the location and urgency of the sound were of importance, as well as the confidence of the technology in its identification of that sound. 