
This content will become publicly available on May 6, 2024

Title: How learners produce data from text in classifying clickbait

Text provides a compelling example of unstructured data that can be used to motivate and explore classification problems. Challenges arise in representing features of text and in how students link text represented as character strings to features that capture connections with the underlying phenomena. To observe how students reason with text data in scenarios designed to elicit certain aspects of the domain, we employed a task‐based interview method using a structured protocol with six pairs of undergraduate students. Our goal was to shed light on students' understanding of text as data using a motivating task to classify headlines as “clickbait” or “news.” Three types of features (function, content, and form) surfaced, the majority from the first scenario. Our analysis of the interviews indicates that this sequence of activities engaged the participants in thinking at both the human‐perception level and the computer‐extraction level and in conceptualizing connections between them.
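As a rough illustration of the computer-extraction level mentioned above, the sketch below turns a headline string into a few countable features. The function name, the feature names, and the sample headline are hypothetical and are not taken from the study; they only show one way a human-perceived cue (for example, "it starts with a number") might be computed from a character string.

```python
import re

# Hypothetical word list for illustration; not from the study.
SECOND_PERSON = {"you", "your", "you're"}


def extract_features(headline: str) -> dict:
    """Compute a few simple, illustrative features from a headline string."""
    # Split the raw character string into word-like tokens.
    words = re.findall(r"[A-Za-z0-9']+", headline)
    lowered = [w.lower() for w in words]
    return {
        # Surface properties of the string itself.
        "n_words": len(words),
        "starts_with_number": bool(words) and words[0].isdigit(),
        "has_question_mark": "?" in headline,
        "has_exclamation": "!" in headline,
        # A simple vocabulary-based count.
        "n_second_person": sum(w in SECOND_PERSON for w in lowered),
    }


features = extract_features("10 Things You Won't Believe About Your Cat!")
```

A feature dictionary like this could then be passed to any classifier; which features actually separate "clickbait" from "news" is an empirical question that this sketch does not answer.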

Journal Name:
Teaching Statistics
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Practitioner notes

    What is already known about this topic

    Scholarly attention has turned to examining Artificial Intelligence (AI) literacy in K‐12 to help students understand the working mechanism of AI technologies and critically evaluate automated decisions made by computer models.

    While efforts have been made to engage students in understanding AI through building machine learning models with data, few of them go in‐depth into teaching and learning of feature engineering, a critical concept in modelling data.

    There is a need for research to examine students' data modelling processes, particularly in the little‐researched realm of unstructured data.

    What this paper adds

    Results show that students developed nuanced understandings of models learning patterns in data for automated decision making.

    Results demonstrate that students drew on prior experience and knowledge in creating features from unstructured data in the learning task of building text classification models.

    Students needed support in performing feature engineering practices, reasoning about noisy features and exploring features in rich social contexts that the data set is situated in.

    Implications for practice and/or policy

    It is important for schools to provide hands‐on model building experiences for students to understand and evaluate automated decisions from AI technologies.

    Students should be empowered to draw on their cultural and social backgrounds as they create models and evaluate data sources.

    To extend this work, educators should consider opportunities to integrate AI learning in other disciplinary subjects (ie, outside of computer science classes).

  2. Abstract
  3. Abstract
  4. Abstract

    Using a mixed methods approach, we explore a relationship between students’ graph reasoning and graph selection via a fully online assessment. Our population includes 673 students enrolled in college algebra, an introductory undergraduate mathematics course, across four U.S. postsecondary institutions. The assessment is accessible on computers, tablets, and mobile phones. There are six items; for each, students are to view a video animation of a dynamic situation (e.g., a toy car moving along a square track), declare their understanding of the situation, select a Cartesian graph to represent a relationship between given attributes in the situation, and enter text to explain their graph choice. To theorize students’ graph reasoning, we draw on Thompson’s theory of quantitative reasoning, which explains students’ conceptions of attributes as being possible to measure. To code students’ written responses, we appeal to Johnson and colleagues’ graph reasoning framework, which distinguishes students’ quantitative reasoning about one or more attributes capable of varying (Covariation, Variation) from students’ reasoning about observable elements in a situation (Motion, Iconic). Quantitizing those qualitative codes, we examine connections between the latent variables of students’ graph reasoning and graph selection. Using structural equation modeling, we report a significant finding: Students’ graph reasoning explains 40% of the variance in their graph selection (standardized regression weight is 0.64, p < 0.001). Furthermore, our results demonstrate that students’ quantitative forms of graph reasoning (i.e., variational and covariational reasoning) influence the accuracy of their graph selection.

  5. It has been shown that intraoperative stress can have a negative effect on surgeons' surgical skills during laparoscopic procedures. For novice surgeons, stressful conditions can lead to significantly higher velocity, acceleration, and jerk of the surgical instrument tips, resulting in faster but less smooth movements. However, it is still not clear which of these kinematic features (velocity, acceleration, or jerk) is the best marker for distinguishing normal from stressed conditions. Therefore, to find the kinematic feature most significantly affected by intraoperative stress, we implemented a spatial attention-based Long Short-Term Memory (LSTM) classifier. In a prior IRB-approved experiment, we collected data from medical students performing an extended peg transfer task who were randomized into a control group and a group performing the task under external psychological stresses. In our prior work, we obtained “representative” normal or stressed movements from this dataset using kinematic data as the input. In this study, a spatial attention mechanism is used to describe the contribution of each kinematic feature to the classification of normal/stressed movements. We tested our classifier under Leave-One-User-Out (LOUO) cross-validation, and the classifier reached an overall accuracy of 77.11% for classifying “representative” normal and stressed movements using kinematic features as the input. More importantly, we also studied the spatial attention extracted from the proposed classifier. Velocity and acceleration on both sides had significantly higher attention for classifying a normal movement ([Formula: see text]); velocity ([Formula: see text]) and jerk ([Formula: see text]) on the nondominant hand had significantly higher attention for classifying a stressed movement, and it is worth noting that the attention of jerk on the nondominant hand side had the largest increment when moving from describing normal movements to stressed movements ([Formula: see text]). In general, we found that jerk on the nondominant hand side can be used to characterize stressed movements for novice surgeons more effectively.