Real-time Speaker counting in a cocktail party scenario using Attention-guided Convolutional Neural Network
Most current speech technology systems are designed to operate well even in the presence of multiple active speakers. However, most solutions assume that the number of concurrent speakers is known. Unfortunately, this information might not always be available in real-world applications. In this study, we propose a real-time, single-channel attention-guided Convolutional Neural Network (CNN) to estimate the number of active speakers in overlapping speech. The proposed system extracts higher-level information from the speech spectral content using a CNN model. Next, the attention mechanism summarizes the extracted information into a compact feature vector without losing critical information. Finally, the active speakers…
Free, publicly accessible full text available September 1, 2022
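
The summarizing step described in the abstract above can be illustrated with a generic attention-pooling sketch. This is not the authors' model: the dot-product scoring, the single `query` vector, and the toy `frames` values are all assumptions made for illustration, and the CNN that would produce the per-frame features is omitted.

```python
import math

def attention_pool(frames, query):
    """Collapse per-frame feature vectors into one fixed-size vector.

    frames: list of equal-length lists of floats, one per time frame
            (stand-ins for CNN outputs; hypothetical values).
    query:  list of floats of the same length (a hypothetical learned
            parameter; the abstract does not describe the scoring function).
    """
    # Score each frame by its dot product with the query vector.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    # Softmax, shifted by the max score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The pooled vector is the attention-weighted average of the frames.
    dim = len(frames[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, frames))
              for d in range(dim)]
    return pooled, weights

frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # a zero query scores every frame equally
pooled, weights = attention_pool(frames, query)
```

With a zero query the weights are uniform, so the pooled vector is simply the mean of the frames; a trained query would instead emphasize the frames most informative for the task.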
Child vs Adult Speaker Diarization of naturalistic audio recordings in preschool environment using Deep Neural Networks
Speech and language development in children is crucial for ensuring optimal outcomes in their long-term development and life-long educational journey. A child’s vocabulary size at the time of kindergarten entry is an early indicator of learning to read and of potential long-term success in school. The preschool classroom is thus a promising venue for monitoring growth in young children by measuring their interactions with teachers and classmates. Automatic Speech Recognition (ASR) technologies give early childhood researchers the ability to automatically analyze naturalistic recordings in these settings. For this purpose, data are collected in a high-quality childcare center in the…
Free, publicly accessible full text available March 1, 2022
A key goal of the Next Generation Science Standards is to promote interest in and exploration of natural phenomena. In preschool settings, teachers prompt exploration by asking questions and encouraging informal exploration and experimentation. To date, live or offline video observation has been the sole way to capture the quality of teacher question asking in the pre-k classroom (e.g., Sanders et al., 2016), and Automatic Speech Recognition (ASR) has not been used to measure the content or quality of teacher talk. Here, we used ASR to quantify preschool teachers’ use of keywords that promote student exploration and inquiry.
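
As a rough illustration of quantifying keyword use in an ASR transcript, the sketch below counts occurrences of a small keyword set in plain text. The keyword list, the tokenization rule, and the sample sentence are all hypothetical; the abstract does not say which keywords or which counting procedure the study used.

```python
import re
from collections import Counter

# Hypothetical inquiry-related keywords; the study's actual list is not given.
KEYWORDS = {"why", "how", "what", "wonder", "predict"}

def count_keywords(transcript, keywords=KEYWORDS):
    """Return a Counter of keyword occurrences in an ASR transcript string."""
    # Lowercase, then split into runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return Counter(t for t in tokens if t in keywords)

# Made-up teacher utterances standing in for ASR output.
sample = "Why do you think it floats? What happens if we add more? I wonder why."
counts = count_keywords(sample)
```

Per-teacher or per-session totals could then be compared by dividing each count by the number of tokens in the transcript, so that longer recordings do not dominate.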