
Title: Toward an Automatic Speech Classifier for the Teacher
Our system classifies audio from microphones worn by the teacher in order to determine (1) whether the teacher is addressing the whole class or talking to individuals or groups of students. In the latter case, it determines (2) whether the teacher is giving formative feedback, giving corrective feedback, chatting socially, or addressing administrative or workflow concerns. This paper reports the initial accuracy of this system against human coding of middle school math classroom behavior. We also compared audio collected through professional hardware versus more accessible alternatives.
Journal Name:
Artificial Intelligence in Education, AIED 2020
Sponsoring Org:
National Science Foundation
More Like this
  1. Researchers have long theorized that characteristics of education systems impact both perceived and experienced corruption in public schools. However, due to insufficient cross-national survey data with measures on corruption in education and unassembled yet publicly available institutional data, there are few empirical tests of this theory. This article provides the rare direct test of the relationship between corruption in European public schools and three education system factors: government expenditure on education, education staff compensation, and teacher workload (pupil–teacher ratio). With a newly constructed harmonized data set for European countries, and controlling for national economic factors and individual characteristics, results of multilevel analyses suggest partial support for the theory that specific institutional characteristics of education systems impact public school corruption. The theorized institutional factors have different effects that depend on whether we examine bribe-giving experience or corruption perception. Results show that bribe-giving experience in public schools of Europe is weakly yet significantly related to education staff compensation. For corruption perception, low levels of government expenditure on education and a lopsided pupil–teacher ratio (too few teachers per student) increase the probability that people view corruption as prevalent.
  2. When President Obama unveiled his plan to give all students in America the opportunity to learn computer science [1], discussions about computational thinking (CT) began in earnest in many organizations across a wide range of disciplines. However, Jeannette Wing stated the importance of CT for everyone a decade earlier in her landmark essay [2]. In recent years, several people and organizations have posted their own definition of CT, which presents a challenge in being able to assess CT understanding and awareness in people. In an effort to build consensus on how to best assess CT, the authors are developing a web-based tool that will enable CT experts globally to populate, review and rate questions that address various attributes of CT. Teaching Engineering Concepts to Harness Future Innovators and Technologists (TECHFIT) is an NSF-funded project that is examining the impact of the TECHFIT intervention based on the educational program’s delivery context. The CT Assessment System is being developed for TECHFIT as a standard way for teacher participants to gauge CT understanding in their students. It has been designed as a functional, web-based tool that supports management of the CT assessment questions database and giving different levels of access to various stakeholders, including the TECHFIT project team and academicians all over the world. The CT Assessment System includes features to enable authorized users to review, insert, and update a variety of questions in different formats. The level of access to this system is determined by the roles/permissions granted by the administrator. It also enables users to have the ability to rate the questions. The ratings are then aggregated to yield an overall rating value. The CT Assessment system has the capability to provide a clean, authentic and acceptable way to assess CT abilities via a common platform across the world.
Attendees of the paper presentation will be invited to sign up and explore this tool to provide feedback for improvement of the tool.
  3. Speech and language development are early indicators of overall analytical and learning ability in children. The preschool classroom is a rich language environment for monitoring and ensuring growth in young children by measuring their vocal interactions with teachers and classmates. Early childhood researchers are naturally interested in analyzing naturalistic vs. controlled lab recordings to measure both quality and quantity of such interactions. Unfortunately, present-day speech technologies are not capable of addressing the wide dynamic scenario of early childhood classroom settings. Due to the diversity of acoustic events/conditions in such daylong audio streams, automated speaker diarization technology would need to be advanced to address this challenging domain for segmenting audio as well as information extraction. This study investigates an alternate Deep Learning-based diarization solution for segmenting classroom interactions of 3- to 5-year-old children with teachers. In this context, the focus is on speech-type diarization, which classifies speech segments as coming from either adults or children, partitioned across multiple classrooms. Our proposed ResNet model achieves a best F1-score of ∼71.0% on data from two classrooms, based on dev and test sets of each classroom. Additionally, F1-scores are obtained for individual segments with corresponding speaker tags (e.g., adult vs. child), which provide knowledge for educators on child engagement through naturalistic communications. The study demonstrates the prospects of addressing educational assessment needs through communication audio stream analysis, while maintaining both security and privacy of all children and adults. The resulting child communication metrics have been used for broad-based feedback for teachers with the help of visualizations.
  4. Technological advancements and increased access have prompted the adoption of head-mounted display based virtual reality (VR) for neuroscientific research, manual skill training, and neurological rehabilitation. Applications that focus on manual interaction within the virtual environment (VE), especially haptic-free VR, critically depend on virtual hand-object collision detection. Knowledge about how multisensory integration related to hand-object collisions affects perception-action dynamics and reach-to-grasp coordination is needed to enhance the immersiveness of interactive VR. Here, we explored whether and to what extent sensory substitution for haptic feedback of hand-object collision (visual, audio, or audiovisual) and collider size (size of spherical pointers representing the fingertips) influences reach-to-grasp kinematics. In Study 1, visual, auditory, or combined feedback were compared as sensory substitutes to indicate the successful grasp of a virtual object during reach-to-grasp actions. In Study 2, participants reached to grasp virtual objects using spherical colliders of different diameters to test if virtual collider size impacts reach-to-grasp. Our data indicate that collider size but not sensory feedback modality significantly affected the kinematics of grasping. Larger colliders led to a smaller size-normalized peak aperture. We discuss this finding in the context of a possible influence of spherical collider size on the perception of the virtual object’s size and hence effects on motor planning of reach-to-grasp. Critically, reach-to-grasp spatiotemporal coordination patterns were robust to manipulations of sensory feedback modality and spherical collider size, suggesting that the nervous system adjusted the reach (transport) component commensurately to the changes in the grasp (aperture) component. These results have important implications for research, commercial, industrial, and clinical applications of VR.
  5. This work explores whether audio feedback style and user ability influence user techniques, performance, and preference in the interpretation of node graph data among sighted individuals and those who are blind or visually impaired. This study utilized a posttest-only basic randomized design comparing two treatments, in which participants listened to short audio clips describing a sequence of transitions occurring in a node graph. The results found that participants tend to use certain techniques and have corresponding preferences based on their ability. A correlation was also found between equivalently high feedback design performance and lack of overall feedback design preference. These results imply that universal technologies should consider avoiding design constraints that allow for only one optimal usage technique, especially if that technique is dependent on a user’s ability.