INTRODUCTION: CRSS-UTDallas initiated and oversaw the efforts to recover Apollo mission communications by re-engineering the NASA SoundScriber playback system and digitizing 30-channel analog audio tapes, covering the entire Apollo-11 and Apollo-13 missions and Gemini-8 during 2011-17 [1,6]. This vast data resource was made publicly available along with supplemental speech and language technology meta-data, based on CRSS pipeline diarization transcripts and conversational speaker time-stamps for the Apollo teams at the NASA Mission Control Center [2,4]. Renewed efforts since 2021 have resulted in the digitization of an additional +50,000 hrs of audio from the Apollo 7, 8, 9, 10, and 12 missions, along with the remaining Apollo-13 tapes. These cumulative digitization efforts have enabled the development of the largest publicly available speech data resource of unprompted, real conversations recorded in naturalistic environments. Deployment of this massive corpus has inspired multiple collaborative initiatives, such as the web resources ExploreApollo (https://app.exploreapollo.org) and LanguageARC (https://languagearc.com/projects/21) [3]. ExploreApollo serves as the visualization and playback tool, while LanguageARC is the crowd-sourced subject-content tagging resource developed by undergraduate and graduate students, intended as an educational resource for K-12 students and STEM/Apollo enthusiasts. Significant algorithmic advancements include advanced deep learning models that improve automatic transcript generation quality and extract high-level knowledge, such as labels for the topics discussed across different mission stages. Efficient transcript generation and topic extraction tools for this naturalistic audio have wide applications, including content archival and retrieval, speaker indexing, education, and group dynamics and team cohesion analysis. Some of these applications have been deployed in our online portals to provide a more immersive experience for students and researchers. Continued worldwide outreach in the form of the Fearless Steps Challenges has proven successful, most recently with Phase-4 of the challenge series, which has motivated research in low-level tasks such as speaker diarization and high-level tasks such as topic identification.

IMPACT: Distribution and visualization of the Apollo audio corpus through the above-mentioned online portals and the Fearless Steps Challenges have produced significant impact, both as a STEM education resource for K-12 students and as an SLT development resource with real-world applications for research organizations globally. The speech technologies developed by CRSS-UTDallas using the Fearless Steps Apollo corpus have improved on previous benchmarks for multiple tasks [1,5]. The continued initiative will extend the current digitization efforts to include over 150,000 hours of audio recorded during all Apollo missions.

ILLUSTRATION: We will demonstrate the ExploreApollo and LanguageARC online portals with newly digitized audio playback, in addition to improved SLT baseline systems and results from ASR and topic identification systems, including research performed on the conversational corpus. Performance analysis visualizations will also be illustrated, along with results from past challenges and their state-of-the-art system improvements.
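To make the diarization meta-data mentioned above more concrete, the following is a minimal, hypothetical sketch of a single diarized-transcript record carrying a channel, speaker label, time stamps, and transcript text. The field names and example values are illustrative assumptions, not the released CRSS schema.

```python
# Hypothetical sketch of one diarized-transcript record of the kind described
# above (channel, speaker label, time stamps, transcript text). Field names and
# values are illustrative assumptions, not the released CRSS meta-data schema.
from dataclasses import dataclass


@dataclass
class DiarizedSegment:
    mission: str      # e.g., "Apollo-11"
    channel: int      # one of the 30 analog tracks on the tape
    start_sec: float  # segment start time within the digitized channel
    end_sec: float    # segment end time
    speaker: str      # diarization/tracking label, e.g., a console position
    text: str         # ASR or human-corrected transcript for the segment


segment = DiarizedSegment(
    mission="Apollo-11",
    channel=14,             # illustrative channel number
    start_sec=3601.25,
    end_sec=3604.80,
    speaker="CAPCOM",
    text="Roger, we copy you on the ground.",
)
print(f"[{segment.speaker}] {segment.start_sec:.2f}-{segment.end_sec:.2f} s: {segment.text}")
```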
Fearless Steps Apollo: Advancements in robust speech technologies and naturalistic corpus development from Earth to the Moon
Recent developments in deep learning strategies have revolutionized Speech and Language Technologies (SLT). Deep learning models often rely on massive naturalistic datasets to produce the complexity required for superior performance. However, most massive SLT datasets are not publicly available, limiting the potential for academic research. Through this work, we showcase the CRSS-UTDallas-led efforts to recover, digitize, and openly distribute over 50,000 hrs of speech data recorded during the 12 manned NASA Apollo missions, and outline our continuing efforts to digitize and create meta-data through diarization of the remaining 100,000 hrs. We present novel deep learning-based speech processing solutions developed to extract high-level information from this massive dataset. The Fearless Steps APOLLO resource is a 50,000 hr audio collection from 30-track analog tapes originally used to document Apollo missions 1, 7, 8, 10, 11, and 13. A customized tape read-head developed to digitize all 30 channels simultaneously has been deployed to expedite digitization of the remaining mission tapes. Diarized transcripts for these unlabeled audio communications have also been generated to facilitate open research across the speech science, historical archive, education, and speech technology communities. The robust technologies developed to generate human-readable transcripts include: (i) speaker diarization, (ii) speaker tracking, and (iii) text output from speech recognition systems.
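As a rough illustration of how the three components listed above fit together, the following is a minimal sketch of a diarization-plus-ASR pipeline built from off-the-shelf open-source tools (pyannote.audio and OpenAI Whisper), not the CRSS systems described here; the model names, placeholder access token, and file name are assumptions, and the speaker-tracking stage (linking diarization labels to known personnel) is omitted.

```python
# Minimal sketch of a diarization + speech-recognition pipeline of the kind
# outlined above, using off-the-shelf open-source tools rather than the CRSS
# systems. Assumes pyannote.audio and openai-whisper are installed, a Hugging
# Face access token is available, and "channel.wav" is one digitized track.
import whisper
from pyannote.audio import Pipeline

AUDIO = "channel.wav"  # hypothetical single-channel excerpt

# (i) Speaker diarization: who spoke when.
diarizer = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="HF_TOKEN"
)
diarization = diarizer(AUDIO)

# (iii) Speech recognition: what was said, with segment-level time stamps.
asr = whisper.load_model("base")
asr_result = asr.transcribe(AUDIO)


def speaker_at(t: float) -> str:
    """Return the diarization label active at time t (seconds), if any."""
    for turn, _, label in diarization.itertracks(yield_label=True):
        if turn.start <= t <= turn.end:
            return label
    return "UNK"


# Merge the two outputs into a human-readable, speaker-attributed transcript.
for seg in asr_result["segments"]:
    midpoint = 0.5 * (seg["start"] + seg["end"])
    print(f"{seg['start']:7.2f}-{seg['end']:7.2f}  {speaker_at(midpoint):>12}  {seg['text'].strip()}")
```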
- Award ID(s):
- 2016725
- PAR ID:
- 10484506
- Publisher / Repository:
- Acoustical Society of America (Fall Meeting)
- Date Published:
- Journal Name:
- Acoustical Society of America (Fall Meeting)
- Volume:
- 152
- Page Range / eLocation ID:
- A61
- Format(s):
- Medium: X
- Location:
- Nashville, TN
- Sponsoring Org:
- National Science Foundation
More Like this
-
Fearless Steps (FS) APOLLO is a +50,000 hr audio resource established by CRSS-UTDallas capturing all communications between NASA-MCC personnel, backroom staff, and astronauts across the manned Apollo missions. Such a massive audio resource, unlabeled and without metadata, provides limited benefit for communities outside Speech and Language Technology (SLT). Supplementing this audio with rich metadata, developed using robust automated mechanisms to transcribe and highlight naturalistic communications, can facilitate open research opportunities for the SLT, speech science, education, and historical archival communities. In this study, we focus on customizing keyword spotting (KWS) and topic detection systems as an initial step toward conversational understanding. Extensive research in automatic speech recognition (ASR), speech activity detection, and speaker diarization using the manually transcribed 125 h FS Challenge corpus has demonstrated the need for robust domain-specific model development. A major challenge in training KWS systems and topic detection models is the availability of word-level annotations. Forced alignment schemes evaluated using state-of-the-art ASR show significant degradation in segmentation performance. This study explores the challenges in extracting accurate keyword segments using existing sentence-level transcriptions and proposes domain-specific KWS-based solutions to detect conversational topics in audio streams.
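As a simplified illustration of transcript-driven keyword spotting and topic tagging of the kind discussed in the abstract above, the sketch below scans time-stamped transcript segments for topic-specific keyword lists; the topics, keywords, and example segments are illustrative assumptions, not the domain-specific models described in the study.

```python
# Simplified sketch of keyword-based topic tagging over time-stamped transcript
# segments. Topic names, keyword lists, and example segments are illustrative
# assumptions, not the domain-specific KWS/topic-detection models in the study.
import re

TOPIC_KEYWORDS = {
    "guidance":   ["program alarm", "trajectory", "delta-h"],
    "propulsion": ["fuel", "descent engine", "thrust"],
    "comms":      ["loud and clear", "readback", "omni"],
}


def tag_topics(segments):
    """segments: iterable of (start_sec, end_sec, text). Returns keyword hits per topic."""
    hits = {topic: [] for topic in TOPIC_KEYWORDS}
    for start, end, text in segments:
        lowered = text.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            for kw in keywords:
                if re.search(r"\b" + re.escape(kw) + r"\b", lowered):
                    hits[topic].append((kw, start, end))
    return hits


example = [
    (102.4, 105.1, "Roger, we copy the program alarm."),
    (230.0, 233.8, "Descent engine looking good, fuel at eight percent."),
]
print(tag_topics(example))
```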
-
In this study, we present the Fearless Steps APOLLO Community Resource, a collection of audio and corresponding meta-data diarized from the NASA Apollo missions. Massive naturalistic speech data that is time-synchronized and free of human-subject privacy constraints is very rare and difficult to collect, organize, and deploy. The Apollo missions audio is the largest collection of multi-speaker, multi-channel data, in which over 600 personnel communicate across multiple missions to achieve strategic space exploration goals. A total of 12 manned missions over a six-year period produced extensive 30-track 1-inch analog tapes containing over 150,000 hours of audio. This presents the wider research community a unique opportunity to extract multi-modal knowledge in speech science, team cohesion and group dynamics, and historical archive preservation. We aim to make this entire resource, along with the supporting speech technology meta-data creation, publicly available as a Community Resource for the development of speech and behavioral science. Here we present the development of this community resource, our outreach efforts, and the technological developments resulting from this data. We finally discuss the planned future directions for this community resource.
-
Speech activity detection (SAD) serves as a crucial front-end system to several downstream Speech and Language Technology (SLT) tasks such as speaker diarization, speaker identification, and speech recognition. Recent years have seen deep learning (DL)-based SAD systems designed to improve robustness against static background noise and interfering speakers. However, SAD performance can be severely limited for conversations recorded in naturalistic environments due to dynamic acoustic scenarios and previously unseen non-speech artifacts. In this letter, we propose an end-to-end deep learning framework designed to be robust to time-varying noise profiles observed in naturalistic audio. We develop a novel SAD solution for the UTDallas Fearless Steps Apollo corpus based on NASA’s Apollo missions. The proposed system leverages spectro-temporal correlations with a threshold optimization mechanism to adjust to acoustic variabilities across multiple channels and missions. This system is trained and evaluated on the Fearless Steps Challenge (FSC) corpus (a subset of the Apollo corpus). Experimental results indicate a high degree of adaptability to out-of-domain data, achieving a relative Detection Cost Function (DCF) performance improvement of over 50% compared to the previous FSC baselines and state-of-the-art (SOTA) SAD systems. The proposed model also outperforms the most recent DL-based SOTA systems from FSC Phase-4. Ablation analysis is conducted to confirm the efficacy of the proposed spectro-temporal features.
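For reference, the Detection Cost Function reported above combines miss and false-alarm rates with fixed weights; the sketch below uses the 0.75/0.25 weighting we associate with the Fearless Steps Challenge SAD metric, which should be verified against the official scoring tools before reproducing published numbers.

```python
# Sketch of a weighted Detection Cost Function (DCF) of the kind used to score
# SAD systems. The 0.75/0.25 weights follow our reading of the Fearless Steps
# Challenge SAD metric; treat them as an assumption and check the official
# scoring tool before comparing against published results.
def dcf(p_miss: float, p_fa: float, w_miss: float = 0.75, w_fa: float = 0.25) -> float:
    """Combine miss and false-alarm probabilities (both in [0, 1]) into one cost."""
    return w_miss * p_miss + w_fa * p_fa


# Illustrative operating point: 4% missed speech, 6% false alarms -> DCF = 0.045
print(dcf(0.04, 0.06))
```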
-
Speaker diarization has traditionally been explored using datasets that are either clean, feature a limited number of speakers, or have a large volume of data but lack the complexities of real-world scenarios. This study takes a unique approach by focusing on the Fearless Steps APOLLO audio resource, a challenging corpus that contains over 70,000 hours of audio data (A-11: 10k hrs), the majority of which remains unlabeled. This corpus presents considerable challenges such as diverse acoustic conditions, high levels of background noise, overlapping speech, data imbalance, and a variable number of speakers with varying utterance durations. To address these challenges, we propose a robust speaker diarization framework built on a dynamic Graph Attention Network optimized using data augmentation. Our proposed framework attains a Diarization Error Rate (DER) of 19.6% when evaluated using ground truth speech segments. Notably, our work is the first to recognize, track, and perform conversational analysis on the entire Apollo-11 mission for speakers who were unidentified until now. This work stands as a significant contribution to both historical archiving and the development of robust diarization systems, particularly relevant for challenging real-world scenarios.
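The Diarization Error Rate quoted above is conventionally computed as the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time; the sketch below shows that arithmetic with an illustrative error breakdown (the component durations are invented for the example, not taken from the paper).

```python
# Sketch of the standard Diarization Error Rate (DER) computation: missed
# speech, false-alarm speech, and speaker-confusion time, divided by the total
# reference speech time. The component durations below are illustrative only.
def der(miss_sec: float, fa_sec: float, confusion_sec: float, ref_speech_sec: float) -> float:
    return (miss_sec + fa_sec + confusion_sec) / ref_speech_sec


# Example: 120 s missed + 80 s false alarm + 192 s confusion over 2000 s of speech
print(f"DER = {der(120.0, 80.0, 192.0, 2000.0):.1%}")  # -> DER = 19.6%
```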