FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition

Chen, Szu-Jui; Xie, Jiamin; Hansen, John H.L.

doi:10.21437/Interspeech.2022-10917

Self-supervised learning representations (SSLR) have resulted in robust features for downstream tasks in many fields. Recently, several SSLRs have shown promising results on automatic speech recognition (ASR) benchmark corpora. However, previous studies have only shown performance for solitary SSLRs as an input feature for ASR models. In this study, we propose to investigate the effectiveness of diverse SSLR combinations using various fusion methods within end-to-end (E2E) ASR models. In addition, we will show there are correlations between these extracted SSLRs. As such, we further propose a feature refinement loss for decorrelation to efficiently combine the set of input features. For evaluation, we show that the proposed “FeaRLESS learning features” perform better than systems without the proposed feature refinement loss for both the WSJ and Fearless Steps Challenge (FSC) corpora.
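
The abstract does not give the form of the feature refinement loss, so the following is only an illustrative sketch of one generic way to penalize correlation between two feature streams. It is not the authors' formulation. The function names (`standardize`, `cross_correlation_penalty`), the use of the mean squared cross-correlation, and the toy inputs are all assumptions made for illustration. Each input is a list of frames, and each frame is a list of feature values.

```python
import math


def standardize(feats):
    """Return a copy with each feature dimension zero-mean, unit-variance over frames."""
    n_frames = len(feats)
    n_dims = len(feats[0])
    out = [[0.0] * n_dims for _ in range(n_frames)]
    for d in range(n_dims):
        col = [row[d] for row in feats]
        mean = sum(col) / n_frames
        var = sum((x - mean) ** 2 for x in col) / n_frames
        std = math.sqrt(var) + 1e-8  # small constant avoids division by zero
        for t in range(n_frames):
            out[t][d] = (feats[t][d] - mean) / std
    return out


def cross_correlation_penalty(feats_a, feats_b):
    """Mean squared cross-correlation between two feature streams.

    Hypothetical illustration only: the value is near 0 when the streams are
    uncorrelated and near 1 when every pair of dimensions is perfectly correlated.
    """
    a = standardize(feats_a)
    b = standardize(feats_b)
    n_frames = len(a)
    dims_a, dims_b = len(a[0]), len(b[0])
    total = 0.0
    for i in range(dims_a):
        for j in range(dims_b):
            corr = sum(a[t][i] * b[t][j] for t in range(n_frames)) / n_frames
            total += corr ** 2
    return total / (dims_a * dims_b)


# Toy streams (assumed data): the first pair is linearly related, the second is not.
correlated = cross_correlation_penalty([[1.0], [2.0], [3.0], [4.0]],
                                       [[2.0], [4.0], [6.0], [8.0]])
uncorrelated = cross_correlation_penalty([[1.0], [-1.0], [1.0], [-1.0]],
                                         [[1.0], [1.0], [-1.0], [-1.0]])
```

In a training setup, a penalty of this kind would typically be added to the main ASR objective with a weighting factor. Whether the paper does this, and with what weight, is not stated in the abstract.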

Award ID(s): 2016725

PAR ID: 10402503

Author(s) / Creator(s): Chen, Szu-Jui; Xie, Jiamin; Hansen, John H.L.

Date Published: 2022-09-18

Journal Name: ISCA INTERSPEECH-2022

Page Range / eLocation ID: 3058 to 3062

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free publicly accessible full text (conference paper): https://doi.org/10.21437/Interspeech.2022-10917
