- Award ID(s):
- Publication Date:
- NSF-PAR ID:
- Journal Name: Proceedings of Interspeech 2019
- Page Range or eLocation-ID: 3163 to 3167
- Sponsoring Org: National Science Foundation
More Like this
Speech enhancement is an essential component in robust automatic speech recognition (ASR) systems. Most speech enhancement methods today are based on neural networks that use feature mapping or mask learning. This paper proposes a novel speech enhancement method that integrates time-domain feature mapping and mask learning into a unified framework using a Generative Adversarial Network (GAN). The proposed framework processes the received waveform and decouples speech and noise signals, which are fed into two short-time Fourier transform (STFT) convolution 1-D layers that map the waveforms to spectrograms in the complex domain. These speech and noise spectrograms are then used to compute the speech mask loss. The proposed method is evaluated using the TIMIT data set for seen and unseen signal-to-noise ratio conditions. It is shown that the proposed method outperforms speech enhancement methods based on a Deep Neural Network (DNN) or a Speech Enhancement Generative Adversarial Network (SEGAN).
On-chip spectrometers have the potential to offer dramatic size, weight, and power advantages over conventional benchtop instruments for many applications such as spectroscopic sensing, optical network performance monitoring, hyperspectral imaging, and radio-frequency spectrum analysis. Existing on-chip spectrometer designs, however, are limited in spectral channel count and signal-to-noise ratio. Here we demonstrate a transformative on-chip digital Fourier transform spectrometer that acquires high-resolution spectra via time-domain modulation of a reconfigurable Mach-Zehnder interferometer. The device, fabricated and packaged using industry-standard silicon photonics technology, claims the multiplex advantage to dramatically boost the signal-to-noise ratio and unprecedented scalability capable of addressing exponentially increasing numbers of spectral channels. We further explore and apply machine learning regularization techniques to spectrum reconstruction. Using an ‘elastic-D1’ regularized regression method that we develop, we achieve significant noise suppression for both broad (>600 GHz) and narrow (<25 GHz) spectral features, as well as spectral resolution enhancement beyond the classical Rayleigh criterion.
Obeid, Iyad Selesnick (Ed.) Electroencephalography (EEG) is a popular clinical monitoring tool used for diagnosing brain-related disorders such as epilepsy. As monitoring EEGs in a critical-care setting is an expensive and tedious task, there is great interest in developing real-time EEG monitoring tools to improve patient care quality and efficiency. However, clinicians require automatic seizure detection tools that provide decisions with at least 75% sensitivity and less than 1 false alarm (FA) per 24 hours. Some commercial tools have recently claimed to reach such performance levels, including the Olympic Brainz Monitor and Persyst 14. In this abstract, we describe our efforts to transform a high-performance offline seizure detection system into a low-latency real-time or online seizure detection system. An overview of the system is shown in Figure 1. The main difference between an online and an offline system is that an online system should always be causal and have minimum latency, which is often defined by domain experts. The offline system, shown in Figure 2, uses two phases of deep learning models with postprocessing. The channel-based long short-term memory (LSTM) model (Phase 1 or P1) processes linear frequency cepstral coefficients (LFCC) features from each EEG …
There has been a flurry of recent literature studying streaming algorithms for which the input stream is chosen adaptively by a black-box adversary who observes the output of the streaming algorithm at each time step. However, these algorithms fail when the adversary has access to the internal state of the algorithm, rather than just the output of the algorithm. We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the L1-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our L1-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design is for estimating the number of distinct elements in a …
Partitioning local seismogram wavefields using continuous wavelet transform methods for IRIS wavefield experiment arrays
We applied nonlinear thresholding and scale–time gating in the continuous wavelet transform (CWT) domain to denoise, identify and characterize seismic phases contained in gradiometer and phased array waveforms of four seismic events recorded during the 2016 Incorporated Research Institutions of Seismology Wavefields Experiment in northern Oklahoma. A dense, 80-element three-component phased array was subset from the linear array deployments to examine background noise, waveform coherence and seismic wave composition for local explosion and earthquake waveforms. CWT techniques were also used to significantly improve gradiometry analyses for data recorded by the geodetic array subexperiment. We observed as much as two orders of magnitude gain in the data signal-to-noise ratio. We also saw improvement in array beam quality after denoising the seismic data. Using the signal partitioning technique, we were able to extract and identify many phases based on their positions on the scale–time plane. CWT denoising and wavefield decomposition techniques also improved gradiometry analysis results from the 112-element geodetic array (also called the gradiometer) since waves could be separated before the computation of wave attributes. The operations of removing noise and gating out signal phases improved signal coherence across array records and provided clear P wave onsets on horizontal …