Title: Accurate and Interpretable Sensor-free Affect Detectors via Monotonic Neural Networks
Sensor-free affect detectors infer student affect from students' activities within intelligent tutoring systems or other online learning environments rather than from sensors. This technology has made affect detection more scalable and less invasive. However, existing detectors are either interpretable but less accurate (e.g., classical algorithms such as logistic regression) or more accurate but uninterpretable (e.g., neural networks). We investigate the use of a new type of neural network that is monotonic after the first layer for affect detection, which can strike a balance between accuracy and interpretability. Results on a real-world student affect dataset show that monotonic neural networks achieve detection accuracy comparable to that of their non-monotonic counterparts while offering some level of interpretability.
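To make "monotonic after the first layer" concrete, here is a minimal sketch in plain Python. It assumes one common construction that the abstract does not confirm as the authors' method: every weight after the first layer is kept non-negative and the activations (ReLU, sigmoid) are non-decreasing, so the output can never decrease when a first-layer unit increases. All function names, layer sizes, and the random weights are hypothetical and for illustration only.

```python
import math
import random


def make_network(n_inputs, n_hidden1, n_hidden2, seed=0):
    """Build random weights (hypothetical sizes); layers after the first are non-negative."""
    rng = random.Random(seed)
    # First layer: unconstrained weights, which may be negative.
    w1 = [[rng.gauss(0, 1) for _ in range(n_inputs)] for _ in range(n_hidden1)]
    # Later layers: taking absolute values keeps every weight non-negative (assumed constraint).
    w2 = [[abs(rng.gauss(0, 1)) for _ in range(n_hidden1)] for _ in range(n_hidden2)]
    w3 = [abs(rng.gauss(0, 1)) for _ in range(n_hidden2)]
    # Biases are unconstrained; a constant shift does not affect monotonicity.
    b2 = [rng.gauss(0, 1) for _ in range(n_hidden2)]
    b3 = rng.gauss(0, 1)
    return w1, w2, w3, b2, b3


def first_layer(w1, x):
    """Unconstrained linear features computed from the raw activity inputs."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w1]


def monotone_head(w2, b2, w3, b3, h):
    """Map first-layer units h to a probability that is non-decreasing in each h[i]."""
    z = [max(0.0, b + sum(w * hi for w, hi in zip(row, h))) for row, b in zip(w2, b2)]
    score = b3 + sum(w * zi for w, zi in zip(w3, z))
    return 1.0 / (1.0 + math.exp(-score))


def predict(params, x):
    """Full forward pass: unconstrained first layer, then the monotone head."""
    w1, w2, w3, b2, b3 = params
    return monotone_head(w2, b2, w3, b3, first_layer(w1, x))


if __name__ == "__main__":
    params = make_network(n_inputs=4, n_hidden1=3, n_hidden2=3, seed=1)
    print(predict(params, [1.0, 0.0, -1.0, 2.0]))
```

Under this assumed construction, each first-layer unit can be read as a learned feature whose increase never lowers the predicted probability, which is one way such a network could offer the partial interpretability the abstract mentions.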
Award ID(s):
1724889
NSF-PAR ID:
10157371
Journal Name:
Proceedings of the 10th International Conference on Learning Analytics and Knowledge
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Sensor-free affect detectors infer student affect from students' activities within intelligent tutoring systems or other online learning environments rather than from sensors. This technology has made affect detection more scalable and less invasive. However, existing detectors are either interpretable but less accurate (e.g., classical algorithms such as logistic regression) or more accurate but uninterpretable (e.g., neural networks). We investigate the use of a new type of neural network that is monotonic after the first layer for affect detection, which can strike a balance between accuracy and interpretability. Results on a real-world student affect dataset show that monotonic neural networks achieve detection accuracy comparable to that of their non-monotonic counterparts while offering some level of interpretability.
  2. Mountain meadows are an essential part of the alpine–subalpine ecosystem; they provide ecosystem services like pollination and are home to diverse plant communities. Changes in climate affect meadow ecology on multiple levels, for example, by altering growing season dynamics. Tracking the effects of climate change on meadow diversity through the impacts on individual species and overall growing season dynamics is critical to conservation efforts. Here, we explore how to combine crowd‐sourced camera images with machine learning to quantify flowering species richness across a range of elevations in alpine meadows located in Mt. Rainier National Park, Washington, USA. We employed three machine‐learning techniques (Mask R‐CNN, RetinaNet and YOLOv5) to detect wildflower species in images taken during two flowering seasons. We demonstrate that deep learning techniques can detect multiple species, providing information on flowering richness in photographed meadows. The results indicate higher richness just above the tree line for most of the species, which is consistent with patterns found in field studies. The two‐stage detector Mask R‐CNN was more accurate than the single‐stage detectors RetinaNet and YOLO, performing best overall with a mean average precision (mAP) of 0.67, followed by RetinaNet (0.5) and YOLO (0.4). Across the methods, using anchor box variations in multiples of 16 improved accuracy. We also show that detection is possible even when pictures contain complex backgrounds and are not in focus. Detection rates differed with species abundance, with additional challenges related to similarity in flower characteristics, labeling errors and occlusion.
    Despite these potential biases and limitations in capturing flowering abundance and location‐specific quantification, accuracy was notable considering the complexity of flower types and picture angles in this dataset. We therefore expect that this approach can be used to address many ecological questions that benefit from automated flower detection, including studies of flowering phenology and floral resources, and that it can complement a wide range of ecological approaches (e.g., field observations, experiments, community science). In all, our study suggests that ecological metrics like floral richness can be efficiently monitored by combining machine learning with easily accessible publicly curated datasets (e.g., Flickr, iNaturalist).
  3. Collaboration is a 21st-century skill as well as an effective method for learning, so detection of collaboration is important for both assessment and instruction. Speech-based collaboration detection can be quite accurate, but collecting the speech of students in classrooms can raise privacy issues. An alternative is to transmit only whether or not the student is speaking. That is, the speech signal is processed at the microphone by a voice activity detector before being transmitted to the collaboration detector. Because the transmitted signal is binary (1 = speaking, 0 = silence), this method mitigates privacy issues. However, it may harm the accuracy of collaboration detection. To find out how much harm is done, this study compared the relative effectiveness of collaboration detectors based on either the binary signal or high-quality audio. Pairs of students were asked to work together on solving complex math problems. Three qualitative levels of interactivity were distinguished: Interaction, Cooperation and Other. Human coders used richer data (several audio and video streams) to choose the code for each episode. Machine learning was used to induce a detector that assigns a code to every episode based on the features. The binary-based collaboration detectors delivered only slightly less accuracy than collaboration detectors based on the high-quality audio signal.
  4. Automatic machine learning-based detectors of various psychological and social phenomena (e.g., emotion, stress, engagement) have great potential to advance basic science. However, when a detector d is trained to approximate an existing measurement tool (e.g., a questionnaire, observation protocol), care must be taken when interpreting measurements collected using d, since they are one step further removed from the underlying construct. We examine how the accuracy of d, as quantified by the correlation q of d's outputs with the ground-truth construct U, impacts the estimated correlation between U (e.g., stress) and some other phenomenon V (e.g., academic performance). In particular: (1) We show that if the true correlation between U and V is r, then the expected sample correlation, over all vectors T^n whose correlation with U is q, is qr. (2) We derive a formula for the probability that the sample correlation (over n subjects) using d is positive given that the true correlation is negative (and vice versa); this probability can be substantial (around 20-30%) for values of n and q that have been used in recent affective computing studies. (3) With the goal of reducing the variance of correlations estimated by an automatic detector, we show that training multiple neural networks d^(1), ..., d^(m) using different training architectures and hyperparameters for the same detection task provides only limited "coverage" of T^n.
  5. IEEE Open Journal of the Computer Society (Ed.)
    While neural networks have generated growing excitement for solving classification tasks such as those in natural language processing, their lack of interpretability remains a great challenge to deploying them in certain high-stakes human-centered applications. To address this issue, we propose a new approach for generating interpretable predictions by inferring a simple three-layer neural network with threshold activations, so that it can benefit from effective neural network training algorithms and, at the same time, produce human-understandable explanations for the results. In particular, the hidden layer neurons in the proposed model are trained with floating point weights and binary output activations. The output neuron is also trainable as a threshold logic function that implements a disjunctive operation, forming the logical-OR of the first-level threshold logic functions. This neural network can be trained using state-of-the-art training methods to achieve high prediction accuracy. An important feature of the proposed architecture is that only a simple greedy algorithm is required to provide a human-understandable explanation with each prediction. In comparison with other explainable decision models, our proposed approach achieves more accurate predictions on a broad set of tabular classification datasets.