Title: Bilingual Speech Recognition by Estimating Speaker Geometry from Video Data
Speech recognition is very challenging in student learning environments that are characterized by significant cross-talk and background noise. To address this problem, we present a bilingual speech recognition system that uses an interactive video analysis system to estimate the 3D speaker geometry for realistic audio simulations. We demonstrate the use of our system in generating a complex audio dataset that contains significant cross-talk and background noise approximating real-life classroom recordings. We then test our proposed system on real-life recordings. In terms of the distance of the speakers from the microphone, our interactive video analysis system achieved a better average error rate of 10.83%, compared to 33.12% for a baseline approach. Our proposed system gave an accuracy of 27.92%, which is 1.5% better than Google Speech-to-Text on the same dataset. For 9 important keywords, our approach gave an average sensitivity of 38% compared to 24% for Google Speech-to-Text, while both methods maintained high average specificity (90% and 92%, respectively).
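To make the keyword evaluation concrete, the sketch below shows one way per-keyword sensitivity and specificity can be computed by treating each utterance as a detection trial; the keyword, transcripts, and function name are illustrative assumptions, not the paper's evaluation code.

    # Minimal sketch: per-keyword sensitivity and specificity, treating each
    # utterance as one detection trial for the keyword. Illustrative only.
    def keyword_metrics(keyword, references, hypotheses):
        tp = fp = tn = fn = 0
        for ref, hyp in zip(references, hypotheses):
            present = keyword in ref.lower().split()
            detected = keyword in hyp.lower().split()
            if present and detected:
                tp += 1
            elif present:
                fn += 1
            elif detected:
                fp += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        return sensitivity, specificity

    refs = ["abre tu libro por favor", "the teacher said hello"]
    hyps = ["abre tu libro por favor", "the teacher said yellow"]
    print(keyword_metrics("libro", refs, hyps))  # (1.0, 1.0)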
Award ID(s):
1842220 1949230 1613637
NSF-PAR ID:
10310065
Journal Name:
Computer Analysis of Images and Patterns. CAIP 2021. Lecture Notes in Computer Science.
Volume:
13052
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    This descriptive study focuses on using voice activity detection (VAD) algorithms to extract student speech data in order to better understand the collaboration of small-group work and the impact of teaching assistant (TA) interventions in undergraduate engineering discussion sections. Audio data were recorded from individual students wearing head-mounted noise-cancelling microphones. Video data of each student group were manually coded for collaborative behaviours (e.g., group task relatedness, group verbal interaction, and group talk content) of students and TA–student interactions. The analysis includes information about turn taking, overall speech-duration patterns, and the amount of overlapping speech observed both when TAs were intervening with groups and when they were not. We found that TAs very rarely provided explicit support regarding collaboration. Key speech metrics, such as the amount of turn overlap and maximum turn duration, revealed important information about the nature of student small-group discussions and TA interventions. TA interactions during small-group collaboration are complex and require nuanced treatments when considering the design of supportive tools.
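    As a rough sketch of how such metrics can be derived from VAD output, the code below computes pairwise overlapping-speech time and each speaker's maximum turn duration from per-speaker (start, end) segments in seconds; the segment format and example values are assumptions, not the study's pipeline.

        # Hypothetical per-speaker VAD segments: (start_s, end_s) tuples.
        from itertools import combinations

        def total_overlap(segs_a, segs_b):
            # Summed time during which both speakers are talking.
            return sum(max(0.0, min(e1, e2) - max(s1, s2))
                       for s1, e1 in segs_a for s2, e2 in segs_b)

        speakers = {
            "s1": [(0.0, 4.2), (10.1, 12.0)],
            "s2": [(3.5, 6.0)],
            "s3": [(5.8, 9.9)],
        }
        for a, b in combinations(speakers, 2):
            print(a, b, round(total_overlap(speakers[a], speakers[b]), 2))

        # Maximum turn duration per speaker.
        print({s: max(end - start for start, end in segs)
               for s, segs in speakers.items()})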

     
  2. Mitrovic, Antonija; Bosch, Nigel (Eds.)
    Classroom environments are challenging for artificially intelligent agents primarily because classroom noise dilutes the interpretability and usefulness of gathered data. This problem is exacerbated when groups of students participate in collaborative problem solving (CPS). Here, we examine how well six popular microphones capture audio from individual groups. A primary use of audio data is automatic speech recognition (ASR); therefore, we evaluate our recordings by examining the accuracy of downstream ASR using the Google Cloud Platform. We simultaneously captured the audio of all microphones for 11 unique groups of three participants, who first read a prepared script and then participated in a collaborative problem solving exercise. We varied participants, noise conditions, and speech contexts. Transcribed speech was evaluated using word error rate (WER). We find that scripted speech is transcribed with a surprisingly high degree of accuracy across groups (average WER = 0.114, SD = 0.044). However, the CPS task was much more difficult (average WER = 0.570, SD = 0.143). We found that most microphones were robust to background noise below a certain threshold, but the AT-Cardioid and ProCon microphones were more robust to higher noise levels. Finally, an analysis of errors revealed that most errors were due to the ASR missing words/phrases rather than mistranscribing them. We conclude with recommendations based on our observations.
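    For reference, word error rate is the word-level edit distance between the reference transcript and the ASR hypothesis, normalized by the reference length. A minimal implementation (not the authors' evaluation code) might look like this:

        def wer(reference: str, hypothesis: str) -> float:
            ref, hyp = reference.split(), hypothesis.split()
            # d[i][j] = edits to turn the first i ref words into the first j hyp words.
            d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
            for i in range(len(ref) + 1):
                d[i][0] = i
            for j in range(len(hyp) + 1):
                d[0][j] = j
            for i in range(1, len(ref) + 1):
                for j in range(1, len(hyp) + 1):
                    sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                    d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
            return d[len(ref)][len(hyp)] / max(len(ref), 1)

        # Two edits (one substitution, one deletion) over six reference words.
        print(wer("we found most microphones were robust",
                  "we found more microphones robust"))  # ~0.333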
  3. The use of audio and video modalities for Human Activity Recognition (HAR) is common, given the richness of the data and the availability of pre-trained ML models built on large corpora of labeled training data. However, audio and video sensors also raise significant consumer privacy concerns. Researchers have thus explored less privacy-invasive modalities such as mmWave Doppler radars, IMUs, and motion sensors. However, the key limitation of these approaches is that most of them do not readily generalize across environments and require significant in-situ training data. Recent work has proposed cross-modality transfer learning approaches to alleviate the lack of labeled training data, with some success. In this paper, we generalize this concept to create a novel system called VAX (Video/Audio to 'X'), where training labels acquired from existing Video/Audio ML models are used to train ML models for a wide range of 'X' privacy-sensitive sensors. Notably, in VAX, once the ML models for the privacy-sensitive sensors are trained, with little to no user involvement, the Audio/Video sensors can be removed altogether to better protect the user's privacy. We built and deployed VAX in ten participants' homes while they performed 17 common activities of daily living. Our evaluation results show that, after training, VAX can use its onboard camera and microphone to detect approximately 15 out of 17 activities with an average accuracy of 90%. For these activities that can be detected using a camera and a microphone, VAX trains a per-home model for the privacy-preserving sensors. These models (average accuracy = 84%) require no in-situ user input. In addition, when VAX is augmented with just one labeled instance for the activities not detected by the VAX A/V pipeline (~2 out of 17), it can detect all 17 activities with an average accuracy of 84%. Our results show that VAX is significantly better than a baseline supervised-learning approach that uses one labeled instance per activity in each home (average accuracy of 79%), since VAX reduces the user burden of providing activity labels by 8x (~2 labels vs. 17 labels).
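    A minimal sketch of this cross-modality transfer idea: predictions from a pretrained audio/video model serve as pseudo-labels for training a model on a privacy-preserving sensor stream, after which the camera and microphone can be removed. The models, features, and data below are hypothetical stand-ins, not the VAX implementation.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)
        av_features = rng.normal(size=(500, 32))      # stand-in A/V embeddings
        sensor_features = rng.normal(size=(500, 16))  # e.g., mmWave/IMU features

        def pretrained_av_model(x):
            # Stand-in for an off-the-shelf A/V activity classifier.
            return (x[:, 0] > 0).astype(int)

        pseudo_labels = pretrained_av_model(av_features)

        # Train the privacy-preserving sensor model on the transferred labels.
        sensor_model = RandomForestClassifier(n_estimators=50, random_state=0)
        sensor_model.fit(sensor_features, pseudo_labels)
        print(sensor_model.predict(sensor_features[:5]))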

     
  4. The microphone systems employed by smart devices such as cellphones and tablets require case penetrations that leave them vulnerable to environmental damage. A structural sensor mounted on the back of the display screen can instead be employed to record audio by capturing the bending vibration signals induced in the display panel by an incident acoustic wave, enabling a functional microphone on a fully sealed device. Distributed piezoelectric sensing elements and low-noise accelerometers were bonded to the surfaces of several different panels and used to record acoustic speech signals. The quality of the recorded signals was assessed using the speech transmission index, and the recordings were transcribed to text using an automatic speech recognition system. Although the quality of the speech signals recorded by the piezoelectric sensors was reduced compared to that of the speech recorded by the accelerometers, the word error rate of each transcription increased by only approximately 2% on average, suggesting that distributed piezoelectric sensors can serve as a low-cost surface microphone for smart devices that employ automatic speech recognition. A method of crosstalk cancellation was also implemented to enable the simultaneous recording and playback of audio signals by an array of piezoelectric elements, and it was evaluated by the measured improvement in the recording's signal-to-interference ratio.
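    For context, the signal-to-interference ratio is a power ratio expressed in decibels; a minimal sketch with synthetic signals (not the study's measurement code):

        import numpy as np

        def sir_db(signal, interference):
            # Ratio of signal power to interference power, in dB.
            return 10 * np.log10(np.sum(signal ** 2) / np.sum(interference ** 2))

        t = np.linspace(0, 1, 8000)
        speech = np.sin(2 * np.pi * 220 * t)         # stand-in recorded speech
        leakage = 0.1 * np.sin(2 * np.pi * 440 * t)  # residual playback after cancellation
        print(f"SIR: {sir_db(speech, leakage):.1f} dB")  # ~20 dB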
  5. Speech emotion recognition (SER) is a challenging task due to the limited availability of real-world labeled datasets. Since it is easier to find unlabeled data, the use of self-supervised learning (SSL) has become an attractive alternative. This study proposes new pre-text tasks for SSL to improve SER. While our target application is SER, the proposed pre-text tasks include audio-visual formulations, leveraging the relationship between acoustic and facial features. Our proposed approach introduces three new unimodal and multimodal pre-text tasks that are carefully designed to learn better representations for predicting emotional cues from speech. Task 1 predicts energy variations (high or low) from a speech sequence. Task 2 uses speech features to predict facial activation (high or low) based on facial landmark movements. Task 3 performs a multi-class emotion recognition task on emotional labels obtained from combinations of action units (AUs) detected across a video sequence. We pre-train a network with 60.92 hours of unlabeled data, fine-tuning the model for the downstream SER task. The results on the CREMA-D dataset show that the model pre-trained on the proposed domain-specific pre-text tasks significantly improves the precision (up to 5.1%), recall (up to 4.5%), and F1-scores (up to 4.9%) of our SER system. 
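    As a rough illustration of Task 1, the sketch below assigns a binary high/low energy-variation label to a speech window; the framing parameters and threshold are assumptions, not the paper's configuration.

        import numpy as np

        def energy_variation_label(wav, sr=16000, frame_ms=25, hop_ms=10, thresh=0.5):
            frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
            energies = np.array([np.sum(wav[i:i + frame] ** 2)
                                 for i in range(0, len(wav) - frame, hop)])
            # Label 1 ("high variation") when the normalized energy spread is large.
            return int(np.std(energies) / (np.mean(energies) + 1e-8) > thresh)

        rng = np.random.default_rng(1)
        steady = 0.1 * rng.normal(size=16000)                   # constant-level noise
        bursty = np.concatenate([0.01 * rng.normal(size=8000),
                                 0.5 * rng.normal(size=8000)])  # quiet, then loud
        print(energy_variation_label(steady), energy_variation_label(bursty))  # 0 1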