Search for: All records

Creators/Authors contains: "Wijngaarden, Adriaan J."


Improved Speech Enhancement Using a Time-Domain GAN with Mask Learning

https://doi.org/10.21437/Interspeech.2020-1946

Lin, Ju; Niu, Sufeng; Wijngaarden, Adriaan J.; McClendon, Jerome L.; Smith, Melissa C.; Wang, Kuang-Ching (October 2020, Proceedings of Interspeech 2020)
Speech enhancement is an essential component in robust automatic speech recognition (ASR) systems. Most current speech enhancement methods are based on neural networks that use feature mapping or mask learning. This paper proposes a novel speech enhancement method that integrates time-domain feature mapping and mask learning into a unified framework using a Generative Adversarial Network (GAN). The proposed framework processes the received waveform and decouples the speech and noise signals, which are fed into two short-time Fourier transform (STFT) 1-D convolution layers that map the waveforms to spectrograms in the complex domain. These speech and noise spectrograms are then used to compute the speech mask loss. The proposed method is evaluated using the TIMIT data set for seen and unseen signal-to-noise ratio conditions. It is shown that the proposed method outperforms speech enhancement methods based on a Deep Neural Network (DNN) or a Speech Enhancement Generative Adversarial Network (SEGAN).
Full Text Available
Speech Enhancement Using Forked Generative Adversarial Networks with Spectral Subtraction

https://doi.org/10.21437/Interspeech.2019-2954

Lin, Ju; Niu, Sufeng; Wei, Zice; Lan, Xiang; Wijngaarden, Adriaan J.; Smith, Melissa C.; Wang, Kuang-Ching (September 2019, Proceedings of Interspeech 2019)
Speech enhancement techniques that use a generative adversarial network (GAN) can effectively suppress noise while allowing models to be trained end-to-end. However, such techniques operate directly on time-domain waveforms, which are often high-dimensional and require extensive computation. This paper proposes a novel GAN-based speech enhancement method, referred to as S-ForkGAN, that operates on log-power spectra rather than on time-domain speech waveforms, and uses a forked GAN structure to extract both speech and noise information. By operating on log-power spectra, one can seamlessly include conventional spectral subtraction techniques, and the parameter space typically has a lower dimension. The performance of S-ForkGAN is assessed for automatic speech recognition (ASR) using the TIMIT data set and a wide range of noise conditions. It is shown that S-ForkGAN outperforms existing GAN-based techniques and that it has a lower complexity.
Full Text Available