Multimodal Depression Classification using Articulatory Coordination Features and Hierarchical Attention Based Text Embeddings

Seneviratne, Nadee; Espy-Wilson, Carol

doi:10.1109/ICASSP43922.2022.9747462

Citation Details

Multimodal depression classification has gained considerable popularity in recent years. We develop a multimodal depression classification system that uses articulatory coordination features extracted from vocal tract variables together with text transcriptions obtained from an automatic speech recognition tool. The system improves the area under the receiver operating characteristic curve over the unimodal classifiers (by 7.5% and 13.7% for audio and text, respectively). We show that, when training data is limited, a segment-level classifier can be trained first and then used to obtain a session-wise prediction without hindering performance, using a multi-stage convolutional recurrent neural network. A text model is trained using a Hierarchical Attention Network (HAN). The multimodal system is developed by combining embeddings from the session-level audio model and the HAN text model.
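The embedding-level combination and the evaluation metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes fusion by simple concatenation followed by a logistic output layer, which the abstract does not confirm; the function names, weights, and toy values are hypothetical and are not taken from the paper.

```python
import math


def fuse_embeddings(audio_emb, text_emb):
    """Join one session's audio and text embeddings into a single vector.

    Concatenation is an assumption made for illustration only.
    """
    return list(audio_emb) + list(text_emb)


def predict_proba(fused, weights, bias):
    """Logistic output layer over the fused vector (hypothetical classifier head)."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def auc_roc(labels, scores):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, chosen only to exercise the functions.
fused = fuse_embeddings([0.2, -0.1], [0.5])
prob = predict_proba(fused, weights=[1.0, 2.0, -1.0], bias=0.0)
auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Comparing `auc_roc` for a fused model against the same metric for audio-only and text-only models is the kind of comparison the reported improvements refer to.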

Award ID(s): 2124270

PAR ID: 10353243

Author(s) / Creator(s): Seneviratne, Nadee; Espy-Wilson, Carol

Date Published: 2022-05-23

Journal Name: International Conference on Acoustics, Speech and Signal Processing

Page Range / eLocation ID: 6252 to 6256

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/ICASSP43922.2022.9747462