Noise-Robust Speech Emotion Recognition Using Shared Self-Supervised Representations with Integrated Speech Enhancement

Tzeng, Jing-Tong; Leem, Seong-Gyun; Salman, Ali N; Lee, Chi-Chun; Busso, Carlos

doi:10.1109/ICASSP49660.2025.10887569

This content will become publicly available on April 6, 2026

Recent studies have demonstrated the effectiveness of fine-tuning self-supervised speech representation models for speech emotion recognition (SER). However, applying SER in real-world environments remains challenging due to pervasive noise. Relying on low-accuracy predictions due to noisy speech can undermine the user’s trust. This paper proposes a unified self-supervised speech representation framework for enhanced speech emotion recognition designed to increase noise robustness in SER while generating enhanced speech. Our framework integrates speech enhancement (SE) and SER tasks, leveraging shared self-supervised learning (SSL)-derived features to improve emotion classification performance in noisy environments. This strategy encourages the SE module to enhance discriminative information for SER tasks. Additionally, we introduce a cascade unfrozen training strategy, where the SSL model is gradually unfrozen and fine-tuned alongside the SE and SER heads, ensuring training stability and preserving the generalizability of SSL representations. This approach demonstrates improvements in SER performance under unseen noisy conditions without compromising SE quality. When tested at a 0 dB signal-to-noise ratio (SNR) level, our proposed method outperforms the original baseline by 3.7% in F1-Macro and 2.7% in F1-Micro scores, where the differences are statistically significant.
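The gradual unfreezing and joint SE/SER training described in the abstract can be illustrated with a small sketch. The abstract does not give the actual schedule, layer count, or loss weighting, so everything here is an assumption made for illustration: the function names `trainable_layers` and `joint_loss`, the parameters `start_epoch`, `epochs_per_layer`, and `se_weight`, the top-down unfreezing order, and the weighted-sum loss are all hypothetical and are not taken from the paper.

```python
from typing import List


def trainable_layers(epoch: int, num_layers: int,
                     start_epoch: int, epochs_per_layer: int) -> List[int]:
    """Return the indices of SSL encoder layers that are trainable at `epoch`.

    Assumed schedule (not from the paper): the encoder stays frozen until
    `start_epoch`, then one more layer is unfrozen every `epochs_per_layer`
    epochs, starting from the top layer (closest to the task heads).
    """
    if epoch < start_epoch:
        return []  # encoder fully frozen; only the SE and SER heads train
    n_unfrozen = min(num_layers, 1 + (epoch - start_epoch) // epochs_per_layer)
    return list(range(num_layers - n_unfrozen, num_layers))


def joint_loss(se_loss: float, ser_loss: float, se_weight: float = 0.5) -> float:
    """Combine the two task losses computed on the shared representation.

    A plain weighted sum is assumed here; the paper may weight or combine
    the enhancement and emotion losses differently.
    """
    return se_weight * se_loss + (1.0 - se_weight) * ser_loss


# Hypothetical 4-layer encoder trained for 9 epochs.
schedule = {
    epoch: trainable_layers(epoch, num_layers=4, start_epoch=2, epochs_per_layer=2)
    for epoch in range(9)
}
for epoch, layers in schedule.items():
    print(f"epoch {epoch}: trainable encoder layers {layers}")
```

In a real training loop, the returned indices would decide which encoder parameters receive gradient updates in each epoch, while both heads stay trainable throughout.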

Award ID(s): 2016719

PAR ID: 10655472

Author(s) / Creator(s): Tzeng, Jing-Tong; Leem, Seong-Gyun; Salman, Ali N; Lee, Chi-Chun; Busso, Carlos

Publisher / Repository: IEEE

Date Published: 2025-04-06

Page Range / eLocation ID: 1 to 5

Format(s): Medium: X

Location: Hyderabad, India

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Conference Paper:
https://doi.org/10.1109/ICASSP49660.2025.10887569
