Title: Speech Enhancement Using Forked Generative Adversarial Networks with Spectral Subtraction
Speech enhancement techniques that use a generative adversarial network (GAN) can effectively suppress noise while allowing models to be trained end-to-end. However, such techniques operate directly on time-domain waveforms, which are high-dimensional and require extensive computation. This paper proposes a novel GAN-based speech enhancement method, referred to as S-ForkGAN, that operates on log-power spectra rather than on time-domain speech waveforms and uses a forked GAN structure to extract both speech and noise information. Operating on log-power spectra allows conventional spectral subtraction techniques to be incorporated seamlessly, and the parameter space typically has a lower dimension. The performance of S-ForkGAN is assessed for automatic speech recognition (ASR) using the TIMIT data set and a wide range of noise conditions. It is shown that S-ForkGAN outperforms existing GAN-based techniques and has lower complexity.
Award ID(s): 1725573
PAR ID: 10203595
Journal Name: Proceedings of Interspeech 2019
Page Range / eLocation ID: 3163 to 3167
Sponsoring Org: National Science Foundation
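The abstract names two ingredients that are easy to prototype independently of the model: the log-power-spectrum front end and the conventional spectral subtraction it can absorb. The paper's S-ForkGAN code is not reproduced here; the sketch below is a minimal NumPy/SciPy illustration, and the frame length, hop, over-subtraction factor, and spectral floor are assumed values.

```python
# Minimal sketch of a log-power-spectrum front end with classic spectral
# subtraction. Illustrative only: S-ForkGAN itself is not public, so the
# parameters (frame_len, hop, over_subtraction, floor) are assumptions.
import numpy as np
from scipy.signal import stft, istft

def log_power_spectrum(wave, fs=16000, frame_len=512, hop=256):
    """STFT -> log-power spectra, the domain the abstract says the GAN operates in."""
    _, _, Y = stft(wave, fs=fs, nperseg=frame_len, noverlap=frame_len - hop)
    power = np.abs(Y) ** 2
    return np.log(power + 1e-10), np.angle(Y)  # keep phase for resynthesis

def spectral_subtraction(log_pow_noisy, log_pow_noise,
                         over_subtraction=1.0, floor=1e-3):
    """Conventional power spectral subtraction, applied in the linear domain."""
    noisy = np.exp(log_pow_noisy)
    noise = np.exp(log_pow_noise)   # e.g. supplied by a noise-estimating fork
    clean = np.maximum(noisy - over_subtraction * noise, floor * noisy)
    return np.log(clean)

def resynthesize(log_pow, phase, fs=16000, frame_len=512, hop=256):
    """Invert the enhanced log-power spectrum using the noisy phase."""
    mag = np.sqrt(np.exp(log_pow))
    _, wave = istft(mag * np.exp(1j * phase), fs=fs,
                    nperseg=frame_len, noverlap=frame_len - hop)
    return wave
```

In a forked setup such as the one described, the noise branch's log-power estimate would supply `log_pow_noise`, and the enhanced log-power spectrum is resynthesized with the noisy phase.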
More Like this
  1.
    Speech enhancement is an essential component in robust automatic speech recognition (ASR) systems. Most speech enhancement methods today are based on neural networks that use feature mapping or mask learning. This paper proposes a novel speech enhancement method that integrates time-domain feature mapping and mask learning into a unified framework using a Generative Adversarial Network (GAN). The proposed framework processes the received waveform and decouples the speech and noise signals, which are fed into two short-time Fourier transform (STFT) convolution 1-D layers that map the waveforms to spectrograms in the complex domain. These speech and noise spectrograms are then used to compute the speech mask loss. The proposed method is evaluated using the TIMIT data set for seen and unseen signal-to-noise ratio conditions. It is shown that the proposed method outperforms speech enhancement methods based on a Deep Neural Network (DNN) or on a Speech Enhancement Generative Adversarial Network (SEGAN).
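    The STFT convolution layers mentioned above can be realized as a 1-D convolution with fixed DFT-basis kernels, so the transform sits inside an end-to-end trainable graph. The PyTorch sketch below shows this general construction; the `n_fft`, `hop`, and Hann-window choices are assumptions, since the paper's exact configuration is not given here.

```python
# Sketch of an STFT implemented as a Conv1d with fixed DFT-basis kernels.
import math
import torch
import torch.nn as nn

class STFTConv1d(nn.Module):
    """Complex STFT as a Conv1d: kernels are windowed cos/sin basis functions."""
    def __init__(self, n_fft=512, hop=256):
        super().__init__()
        n_bins = n_fft // 2 + 1
        window = torch.hann_window(n_fft)
        n = torch.arange(n_fft, dtype=torch.float32)
        k = torch.arange(n_bins, dtype=torch.float32).unsqueeze(1)
        real = torch.cos(2 * math.pi * k * n / n_fft) * window   # (n_bins, n_fft)
        imag = -torch.sin(2 * math.pi * k * n / n_fft) * window
        kernels = torch.cat([real, imag], dim=0).unsqueeze(1)    # (2*n_bins, 1, n_fft)
        self.conv = nn.Conv1d(1, 2 * n_bins, n_fft, stride=hop, bias=False)
        self.conv.weight.data.copy_(kernels)
        self.conv.weight.requires_grad = False   # fixed analysis transform

    def forward(self, wave):                     # wave: (batch, 1, samples)
        spec = self.conv(wave)                   # (batch, 2*n_bins, frames)
        real, imag = spec.chunk(2, dim=1)
        return real, imag                        # complex spectrogram parts

# Usage: map decoupled speech and noise waveforms to complex spectrograms.
stft_layer = STFTConv1d()
speech_real, speech_imag = stft_layer(torch.randn(4, 1, 16000))
```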
  2. In this paper, we present a blockwise optimization method for masking-based networks (BLOOM-Net) for training scalable speech enhancement networks. We design our network with a residual learning scheme and train the internal separator blocks sequentially to obtain a scalable masking-based deep neural network for speech enhancement. Its scalability lets it dynamically adjust the run-time complexity depending on the test-time environment. To this end, we modularize our models so that they can flexibly accommodate varying needs for enhancement performance and constraints on resources, incurring minimal memory or training overhead from the added scalability. Our experiments on speech enhancement demonstrate that the proposed blockwise optimization method achieves the desired scalability with only a slight performance degradation compared to the corresponding models trained end-to-end.
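    A minimal sketch of the residual, blockwise-trainable structure described above follows. The feature sizes and block design are assumptions (BLOOM-Net's actual separator blocks are more elaborate); the toy blocks here only illustrate the scalability mechanism, where inference runs as many blocks as the device budget allows.

```python
# Sketch of a residual, scalable masking network in the spirit of BLOOM-Net.
import torch
import torch.nn as nn

class SeparatorBlock(nn.Module):
    def __init__(self, n_feats=257, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, n_feats),
        )

    def forward(self, feats):
        return self.net(feats)   # residual correction to the mask logits

class BloomLikeNet(nn.Module):
    def __init__(self, n_blocks=4, n_feats=257):
        super().__init__()
        self.blocks = nn.ModuleList(
            [SeparatorBlock(n_feats) for _ in range(n_blocks)])

    def forward(self, feats, n_active=None):
        """feats: (batch, frames, n_feats); n_active sets run-time complexity."""
        n_active = n_active or len(self.blocks)
        logits = torch.zeros_like(feats)
        for block in self.blocks[:n_active]:
            logits = logits + block(feats)       # residual refinement
        return torch.sigmoid(logits)             # magnitude mask in [0, 1]

# Blockwise training: optimize block i with blocks 0..i-1 frozen, which is
# what keeps the memory and training overhead of the added scalability small.
```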
  3. The high computational complexity associated with training deep neural networks limits online and real-time training on edge devices. This paper proposes an end-to-end training and inference scheme that eliminates multiplications through approximate operations in the log domain, which has the potential to significantly reduce implementation complexity. We implement the entire training procedure in the log domain with fixed-point data representations. This training procedure is inspired by hardware-friendly approximations of log-domain addition, which are based on look-up tables and bit-shifts. We show that our 16-bit log-based training can achieve classification accuracy within approximately 1% of the equivalent floating-point baselines on a number of commonly used datasets.
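    The arithmetic trick behind this line of work is that in the log domain a multiply becomes an integer add, while an add needs the Jacobian-logarithm correction log2(1 + 2^-|a-b|), which is what look-up tables and bit-shifts approximate. The sketch below shows the idea in isolation; the fixed-point format and table size are assumptions, not the paper's 16-bit format, and signs are ignored for brevity.

```python
# Sketch of multiply-free, log-domain arithmetic with a LUT-based adder.
import numpy as np

FRAC_BITS = 4        # log2 magnitudes stored as integer codes: L = round(log2(x) * 16)
LUT_SIZE = 256       # beyond this code difference the correction rounds to 0
# Correction table for log2(1 + 2^-d), indexed by the quantized difference.
_diffs = np.arange(LUT_SIZE) / 2.0**FRAC_BITS
LOG_ADD_LUT = np.round(np.log2(1.0 + 2.0**(-_diffs)) * 2**FRAC_BITS).astype(np.int32)

def to_log(x):
    """Encode a positive value; real systems carry a separate sign bit."""
    return int(round(np.log2(x) * 2**FRAC_BITS))

def from_log(code):
    return 2.0 ** (code / 2.0**FRAC_BITS)

def log_mul(a, b):
    """Multiplication of positive values is a single integer addition."""
    return a + b

def log_add(a, b):
    """log2(2^a + 2^b) = max(a, b) + log2(1 + 2^-|a-b|), correction via LUT."""
    diff = abs(a - b)
    corr = int(LOG_ADD_LUT[diff]) if diff < LUT_SIZE else 0
    return max(a, b) + corr

# Multiply-free MAC: 1.5*2.0 + 0.25*4.0, done with adds, compares, and a LUT.
acc = log_add(log_mul(to_log(1.5), to_log(2.0)),
              log_mul(to_log(0.25), to_log(4.0)))
print(from_log(acc))   # ~4.0, up to quantization error
```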
  4. Speech activity detection (SAD) serves as a crucial front-end system to several downstream Speech and Language Technology (SLT) tasks such as speaker diarization, speaker identification, and speech recognition. Recent years have seen deep learning (DL)-based SAD systems designed to improve robustness against static background noise and interfering speakers. However, SAD performance can be severely limited for conversations recorded in naturalistic environments due to dynamic acoustic scenarios and previously unseen non-speech artifacts. In this letter, we propose an end-to-end deep learning framework designed to be robust to time-varying noise profiles observed in naturalistic audio. We develop a novel SAD solution for the UTDallas Fearless Steps Apollo corpus based on NASA’s Apollo missions. The proposed system leverages spectro-temporal correlations with a threshold optimization mechanism to adjust to acoustic variabilities across multiple channels and missions. This system is trained and evaluated on the Fearless Steps Challenge (FSC) corpus (a subset of the Apollo corpus). Experimental results indicate a high degree of adaptability to out-of-domain data, achieving a relative Detection Cost Function (DCF) performance improvement of over 50% compared to the previous FSC baselines and state-of-the-art (SOTA) SAD systems. The proposed model also outperforms the most recent DL-based SOTA systems from FSC Phase-4. Ablation analysis is conducted to confirm the efficacy of the proposed spectro-temporal features. 
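    The threshold-optimization mechanism highlighted in the letter can be illustrated independently of the model: given frame-level speech posteriors from any SAD system, sweep the decision threshold per channel to minimize a detection cost function. In the sketch below, the 0.75/0.25 miss/false-alarm weights follow the usual Fearless Steps Challenge DCF convention; treat them, and the grid resolution, as assumptions.

```python
# Sketch of per-channel decision-threshold optimization for a SAD system.
import numpy as np

def dcf(posteriors, labels, threshold, w_miss=0.75, w_fa=0.25):
    """Detection cost at one operating point; labels are 1 for speech frames."""
    decisions = posteriors >= threshold
    speech, nonspeech = labels == 1, labels == 0
    p_miss = np.mean(~decisions[speech]) if speech.any() else 0.0
    p_fa = np.mean(decisions[nonspeech]) if nonspeech.any() else 0.0
    return w_miss * p_miss + w_fa * p_fa

def optimize_threshold(posteriors, labels, grid=np.linspace(0.05, 0.95, 91)):
    """Pick the threshold minimizing DCF on held-out data for one channel,
    so the operating point adapts to that channel's noise profile."""
    costs = [dcf(posteriors, labels, t) for t in grid]
    return grid[int(np.argmin(costs))]
```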
  5. In realistic speech enhancement settings for end-user devices, we often encounter only a few speakers and noise types that tend to recur in the specific acoustic environment. We propose a novel personalized speech enhancement method to adapt a compact denoising model to the test-time specificity. Our goal in this test-time adaptation is to use no clean speech target from the test speaker, thus fulfilling the requirement for zero-shot learning. To compensate for the lack of clean speech, we employ the knowledge distillation framework: we distill the more advanced denoising results from an overly large teacher model and use them as the pseudo target to train the small student model. This zero-shot learning procedure circumvents the collection of users' clean speech, a process users are reluctant to comply with due to privacy concerns and the technical difficulty of recording clean voice. Experiments on various test-time conditions show that the proposed personalization method can significantly improve the compact models' performance at test time. Furthermore, since the personalized models outperform larger non-personalized baseline models, we claim that personalization achieves model compression with no loss of denoising performance. As expected, the student models underperform the state-of-the-art teacher models.
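    The zero-shot distillation loop described above is simple to sketch: a frozen teacher denoises the user's noisy recordings, and its outputs become pseudo-clean targets for the compact student. In the PyTorch sketch below, `teacher`, `student`, and `noisy_loader` are placeholders for models and data the paper does not specify here, and the L1 loss and optimizer settings are assumptions.

```python
# Sketch of zero-shot personalization via knowledge distillation.
import torch
import torch.nn.functional as F

def personalize(student, teacher, noisy_loader, epochs=10, lr=1e-4):
    teacher.eval()                                 # large model, inference only
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for noisy in noisy_loader:                 # user's noisy speech only
            with torch.no_grad():
                pseudo_clean = teacher(noisy)      # pseudo target: no clean speech needed
            loss = F.l1_loss(student(noisy), pseudo_clean)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```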