Towards Better Meta-Initialization with Task Augmentation for Kindergarten-Aged Speech Recognition

Zhu, Yunzheng; Fan, Ruchao; Alwan, Abeer

doi:10.1109/ICASSP43922.2022.9747599

Citation Details

Towards Better Meta-Initialization with Task Augmentation for Kindergarten-Aged Speech Recognition

Children’s automatic speech recognition (ASR) is always difficult due to, in part, the data scarcity problem, especially for kindergarten-aged kids. When data are scarce, the model might overfit to the training data, and hence good starting points for training are essential. Recently, meta-learning was proposed to learn model initialization (MI) for ASR tasks of different languages. This method leads to good performance when the model is adapted to an unseen language. How-ever, MI is vulnerable to overfitting on training tasks (learner overfitting). It is also unknown whether MI generalizes to other low-resource tasks. In this paper, we validate the effectiveness of MI in children’s ASR and attempt to alleviate the problem of learner overfitting. To achieve model-agnostic meta-learning (MAML), we regard children’s speech at each age as a different task. In terms of learner overfitting, we propose a task-level augmentation method by simulating new ages using frequency warping techniques. Detailed experiments are conducted to show the impact of task augmentation on each age for kindergarten-aged speech. As a result, our approach achieves a relative word error rate (WER) improvement of 51% over the baseline system with no augmentation or initialization. more »

Award ID(s):: 1734380

PAR ID:: 10354759

Author(s) / Creator(s):: Zhu, Yunzheng; Fan, Ruchao; Alwan, Abeer

Date Published:: 2022-05-23

Journal Name:: Proceedings of the IEEE ICASSP

Page Range / eLocation ID:: 8582 to 8586

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.1109/ICASSP43922.2022.9747599

More Like this