Search for: All records

Award ID contains: 2104758

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Free, publicly-accessible full text available January 1, 2026
  2. The dataset includes impulse responses recorded from 14 different rooms. Each room has unique acoustic properties, providing a wide range of RT60, clarity, and early decay time (EDT) values. The recordings are 48 kHz, 32-bit, mono WAV files. The dataset is organized by room, with each subfolder containing the impulse responses specific to that room, as well as a general layout of the room and plots of acoustic data. This dataset supports "Estimating direction of arrival in reverberant environments for wake-word detection using a single structural vibration sensor," published in the Journal of the Acoustical Society of America, Vol. 156, Iss. 4, October 2024. If you plan to download this dataset, we would appreciate it if you could fill out the Google form at https://forms.gle/jnuP2dYRK3CPmXQG6. This will help us understand the usage and impact of this dataset. Your feedback will also help us improve any future extensions of this work.
  3. The vibrational response of an elastic panel to incident acoustic waves is determined by the direction-of-arrival (DOA) of the waves relative to the spatial structure of the panel's bending modes. By monitoring the relative modal excitations of a panel immersed in a sound field, the DOA of the source may be inferred. In reverberant environments, early acoustic reflections and the late diffuse acoustic field may obscure the DOA of incoming sound waves. Panel microphones may be especially susceptible to the effects of reverberation due to their large surface areas and long-decaying impulse responses. An investigation into the effect of reverberation on the accuracy of DOA estimation with panel microphones was made by recording wake-word utterances in eight spaces with reverberation times (RT60s) ranging from 0.27 to 3.00 s. The responses were used to train neural networks to estimate the DOA. Within ±5°, DOA estimation reliability was measured at 95.00% in the least reverberant space, decreasing to 78.33% in the most reverberant space, suggesting an inverse relationship between RT60 and DOA accuracy. Experimental results suggest that a system for estimating DOA with panel microphones can generalize to new acoustic environments by cross-training the system with data from multiple spaces with different RT60s. 
  4. This dataset contains a collection of voice commands for a smart speaker, each beginning with the common wake-word "Hey Alexa". The commands cover a range of tasks such as music control, smart home management, information requests, reminders, shopping, entertainment, and communication. The dataset reflects natural language usage from a diverse group of speakers, capturing various phrasings, inflections, and contexts. It includes contributions from both male and female voices and features speakers with different native languages. If you plan to download this dataset, we would appreciate it if you could fill out the Google form at https://forms.gle/dixQ4mkZ4xbXtXRDA. This will help us understand the usage and impact of this dataset. Your feedback will also help us improve any future extensions of this work.
  5. The direction of arrival (DOA) of an acoustic source is a signal characteristic used by smart audio devices to enable signal enhancement algorithms. Though DOA estimations are traditionally made using a multi-microphone array, we propose that the resonant modes of a surface excited by acoustic waves contain sufficient spatial information that DOA may be estimated using a single structural vibration sensor. In this work, sensors are affixed to an acrylic panel and used to record acoustic noise signals at various angles of incidence. From these recordings, feature vectors containing the sums of the energies in the panel’s isolated modal regions are extracted and used to train deep neural networks to estimate DOA. Experimental results show that when all 13 of the acrylic panel’s isolated modal bands are utilized, the DOA of incident acoustic waves for a broadband noise signal may be estimated by a single structural sensor to within ±5° with a reliability of 98.4%. The size of the feature set may be reduced by eliminating the resonant modes that do not have strong spatial coupling to the incident acoustic wave. Reducing the feature set to the 7 modal bands that provide the most spatial information produces a reliability of 89.7% for DOA estimates within ±5° using a single sensor.
  6. The microphone systems employed by smart devices such as cellphones and tablets require case penetrations that leave them vulnerable to environmental damage. A structural sensor mounted on the back of the display screen can be employed to record audio by capturing the bending vibration signals induced in the display panel by an incident acoustic wave, enabling a functional microphone on a fully sealed device. Distributed piezoelectric sensing elements and low-noise accelerometers were bonded to the surfaces of several different panels and used to record acoustic speech signals. The quality of the recorded signals was assessed using the speech transmission index, and the recordings were transcribed to text using an automatic speech recognition system. Although the quality of the speech signals recorded by the piezoelectric sensors was reduced compared to the quality of speech recorded by the accelerometers, the word error rate of each transcription increased only by approximately 2% on average, suggesting that distributed piezoelectric sensors can be used as a low-cost surface microphone for smart devices that employ automatic speech recognition. A method of crosstalk cancellation was also implemented to enable the simultaneous recording and playback of audio signals by an array of piezoelectric elements and evaluated by the measured improvement in the recording’s signal-to-interference ratio.
  7. Devices from smartphones to televisions are beginning to employ dual-purpose displays, where the display serves as both a video screen and a loudspeaker. In this paper we demonstrate a method to generate localized sound-radiating regions on a flat-panel display. An array of force actuators affixed to the back of the panel is driven by appropriately filtered audio signals so the total response of the panel due to the actuator array approximates a target spatial acceleration profile. The response of the panel to each actuator individually is initially measured via a laser vibrometer, and the required actuator filters for each source position are determined by an optimization procedure that minimizes the mean squared error between the reconstructed and targeted acceleration profiles. Since the single-actuator panel responses are determined empirically, the method does not require analytical or numerical models of the system’s modal response, and thus is well-suited to panels having the complex boundary conditions typical of television screens, mobile devices, and tablets. The method is demonstrated on two panels with differing boundary conditions. When integrated with display technology, the localized audio source rendering method may transform traditional displays into multimodal audio-visual interfaces by colocating localized audio sources and objects in the video stream.
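
The impulse-response dataset in the listing above is characterized by RT60 values. As a rough illustration of how an RT60 can be estimated from a single impulse response, the sketch below uses Schroeder backward integration and a line fit over the -5 dB to -25 dB portion of the decay curve (a T20-style estimate extrapolated to 60 dB). The function names, the fit range, and the synthetic test signal are assumptions made for this example; the listing does not say how the dataset authors computed their values.

```python
import numpy as np

def schroeder_decay_db(ir):
    """Energy decay curve in dB via backward integration of the squared IR."""
    energy = np.asarray(ir, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]           # energy remaining after each sample
    edc = np.maximum(edc, np.finfo(float).tiny)   # avoid log10(0) on trailing zeros
    return 10.0 * np.log10(edc / edc[0])

def estimate_rt60(ir, fs, hi_db=-5.0, lo_db=-25.0):
    """Fit a line to the decay between hi_db and lo_db, extrapolate to -60 dB."""
    decay_db = schroeder_decay_db(ir)
    t = np.arange(len(decay_db)) / fs
    mask = (decay_db <= hi_db) & (decay_db >= lo_db)
    slope, _ = np.polyfit(t[mask], decay_db[mask], 1)  # slope in dB per second
    return -60.0 / slope

# Synthetic check (assumed signal, not from the dataset): an envelope whose
# amplitude falls by 60 dB in 0.5 s, sampled at the dataset's 48 kHz rate.
fs = 48000
t = np.arange(int(1.5 * fs)) / fs
ir = np.exp(-3.0 * np.log(10.0) * t / 0.5)
print(f"estimated RT60: {estimate_rt60(ir, fs):.3f} s")
```

For measured responses, the noise floor at the tail of the recording usually has to be trimmed or compensated before the fit, which this sketch does not attempt.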
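
One abstract in the listing above describes feature vectors built from the sums of the energies in a panel's isolated modal bands. A minimal sketch of that kind of feature extraction follows, assuming the bands are given as frequency ranges and the energy is taken from an FFT power spectrum. The function name `band_energies` and the band edges are hypothetical placeholders; the listing does not give the panel's actual modal frequencies or the authors' processing chain.

```python
import numpy as np

def band_energies(signal, fs, bands):
    """Sum the spectral energy of `signal` inside each (low_hz, high_hz) band."""
    spectrum = np.fft.rfft(np.asarray(signal, dtype=float))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(spectrum) ** 2
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in bands])

# Hypothetical band edges in Hz, chosen only for illustration.
example_bands = [(180, 220), (400, 500), (1000, 1100)]

# Example input: one second of seeded white noise standing in for a sensor recording.
rng = np.random.default_rng(0)
recording = rng.standard_normal(8000)
features = band_energies(recording, fs=8000, bands=example_bands)
print(features / features.sum())  # normalized feature vector, one entry per band
```

A vector like this, computed per recording, would be the input to whatever classifier or regressor maps band energies to an angle of incidence.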
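
The last abstract in the listing above mentions choosing actuator filters by minimizing the mean squared error between reconstructed and target acceleration profiles. At a single frequency, one simple way to pose such a problem is ordinary least squares over complex actuator weights, sketched below. This formulation, the function names, and the random stand-in for measured responses are all assumptions for illustration; the published optimization procedure may differ, for example by adding regularization or constraints.

```python
import numpy as np

def solve_actuator_weights(H, target):
    """Least-squares actuator weights at one frequency.

    H:      (n_points, n_actuators) complex panel response to each actuator
    target: (n_points,) complex target acceleration profile
    """
    weights, *_ = np.linalg.lstsq(H, target, rcond=None)
    return weights

def mean_squared_error(H, weights, target):
    """Mean squared magnitude of the reconstruction error."""
    return np.mean(np.abs(H @ weights - target) ** 2)

# Stand-in data: random complex responses at 200 scan points for 8 actuators,
# and a target profile localized around the middle of the scan.
rng = np.random.default_rng(0)
H = rng.standard_normal((200, 8)) + 1j * rng.standard_normal((200, 8))
x = np.linspace(0.0, 1.0, 200)
target = np.exp(-((x - 0.5) ** 2) / (2 * 0.05 ** 2)).astype(complex)

w = solve_actuator_weights(H, target)
print("MSE with fitted weights:", mean_squared_error(H, w, target))
```

Repeating the solve at each frequency bin would give one complex weight per actuator per bin, which can then be turned into a filter for each actuator.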