NSF PAR Search | NSF Public Access Repository

How learners produce data from text in classifying clickbait

https://doi.org/10.1111/test.12339

Horton, Nicholas J.; Chao, Jie; Palmer, Phebe; Finzer, William (May 2023, Teaching Statistics)

Abstract: Text provides a compelling example of unstructured data that can be used to motivate and explore classification problems. Challenges arise regarding the representation of features of text and student linkage between text representations as character strings and identification of features that embed connections with underlying phenomena. In order to observe how students reason with text data in scenarios designed to elicit certain aspects of the domain, we employed a task‐based interview method using a structured protocol with six pairs of undergraduate students. Our goal was to shed light on students' understanding of text as data using a motivating task to classify headlines as “clickbait” or “news.” Three types of features (function, content, and form) surfaced, the majority from the first scenario. Our analysis of the interviews indicates that this sequence of activities engaged the participants in thinking at both the human‐perception level and the computer‐extraction level and conceptualizing connections between them.
An empirical analysis of high school students' practices of modelling with unstructured data

https://doi.org/10.1111/bjet.13253

Jiang, Shiyan; Nocera, Amato; Tatar, Cansu; Yoder, Michael_Miller; Chao, Jie; Wiedemann, Kenia; Finzer, William; Rosé, Carolyn_P (July 2022, British Journal of Educational Technology)

Abstract: To date, many AI initiatives (eg, AI4K12, CS for All) developed standards and frameworks as guidance for educators to create accessible and engaging Artificial Intelligence (AI) learning experiences for K‐12 students. These efforts revealed a significant need to prepare youth to gain a fundamental understanding of how intelligence is created, applied, and its potential to perpetuate bias and unfairness. This study contributes to the growing interest in K‐12 AI education by examining student learning of modelling real‐world text data. Four students from an Advanced Placement computer science classroom at a public high school participated in this study. Our qualitative analysis reveals that the students developed nuanced and in‐depth understandings of how text classification models—a type of AI application—are trained. Specifically, we found that in modelling texts, students: (1) drew on their social experiences and cultural knowledge to create predictive features, (2) engineered predictive features to address model errors, (3) described model learning patterns from training data and (4) reasoned about noisy features when comparing models. This study contributes to an initial understanding of student learning of modelling unstructured data and offers implications for scaffolding in‐depth reasoning about model decision making.

Practitioner notes

What is already known about this topic
Scholarly attention has turned to examining Artificial Intelligence (AI) literacy in K‐12 to help students understand the working mechanism of AI technologies and critically evaluate automated decisions made by computer models.
While efforts have been made to engage students in understanding AI through building machine learning models with data, few of them go in‐depth into teaching and learning of feature engineering, a critical concept in modelling data.
There is a need for research to examine students' data modelling processes, particularly in the little‐researched realm of unstructured data.

What this paper adds
Results show that students developed nuanced understandings of models learning patterns in data for automated decision making.
Results demonstrate that students drew on prior experience and knowledge in creating features from unstructured data in the learning task of building text classification models.
Students needed support in performing feature engineering practices, reasoning about noisy features and exploring features in rich social contexts that the data set is situated in.

Implications for practice and/or policy
It is important for schools to provide hands‐on model building experiences for students to understand and evaluate automated decisions from AI technologies.
Students should be empowered to draw on their cultural and social backgrounds as they create models and evaluate data sources.
To extend this work, educators should consider opportunities to integrate AI learning in other disciplinary subjects (ie, outside of computer science classes).
High school students’ data modeling practices and processes: From modeling unstructured data to evaluating automated decisions

https://doi.org/10.1080/17439884.2023.2189735

Jiang, Shiyan; Tang, Hengtao; Tatar, Cansu; Rosé, Carolyn P.; Chao, Jie (January 2023, Learning, Media and Technology)

It’s critical to foster artificial intelligence (AI) literacy for high school students, the first generation to grow up surrounded by AI, to understand working mechanism of data-driven AI technologies and critically evaluate automated decisions from predictive models. While efforts have been made to engage youth in understanding AI through developing machine learning models, few provided in-depth insights into the nuanced learning processes. In this study, we examined high school students’ data modeling practices and processes. Twenty-eight students developed machine learning models with text data for classifying negative and positive reviews of ice cream stores. We identified nine data modeling practices that describe students’ processes of model exploration, development, and testing and two themes about evaluating automated decisions from data technologies. The results provide implications for designing accessible data modeling experiences for students to understand data justice as well as the role and responsibility of data modelers in creating AI technologies.
Full Text Available
Spam Four Ways: Making Sense of Text Data

https://doi.org/10.1080/09332480.2022.2066414

Horton, Nicholas J.; Chao, Jie; Finzer, William; Palmer, Phebe (April 2022, CHANCE)

Full Text Available
Modeling Unstructured Data: Teachers as Learners and Designers of Technology-enhanced Artificial Intelligence Curriculum

Tatar, C.; Yoder, M. M.; Coven, M.; Wiedemann, K.; Chao, J.; Finzer, W.; Jiang, S.; Rosé, C. P. (October 2021, International Society of the Learning Sciences Annual Meeting 2021)
In this paper, we present a co-design study with teachers to contribute towards the development of a technology-enhanced Artificial Intelligence (AI) curriculum, focusing on modeling unstructured data. We created an initial design of a learning activity prototype and explored ways to incorporate the design into high school classes. Specifically, teachers explored text classification models with the prototype and reflected on the exploration as a user, learner, and teacher. They provided insights about learning opportunities in the activity and feedback for integrating it into their teaching. Findings from qualitative analysis demonstrate that exploring text classification models provided an accessible and comprehensive approach for integrated learning of mathematics, language arts, and computing with the potential of supporting the understanding of core AI concepts including identifying structure within unstructured data and reasoning about the roles of human insight in developing AI technologies.
Full Text Available
FanfictionNLP: A Text Processing Pipeline for Fanfiction

Yoder, M. M.; Khosla, S.; Shen, Q.; Naik, A.; Jin, H.; Muralidharan, H.; Rosé, C. P. (October 2021, The 3rd Workshop on Narrative Understanding)
Fanfiction presents an opportunity as a data source for research in NLP, education, and social science. However, answering specific research questions with this data is difficult, since fanfiction contains more diverse writing styles than formal fiction. We present a text processing pipeline for fanfiction, with a focus on identifying text associated with characters. The pipeline includes modules for character identification and coreference, as well as the attribution of quotes and narration to those characters. Additionally, the pipeline contains a novel approach to character coreference that uses knowledge from quote attribution to resolve pronouns within quotes. For each module, we evaluate the effectiveness of various approaches on 10 annotated fanfiction stories. This pipeline outperforms tools developed for formal fiction on the tasks of character coreference and quote attribution.
Full Text Available
Modeling Unstructured Data: Teachers as Learners and Designers of Technology-enhanced Artificial Intelligence Curriculum. In de Vries, E., Hod, Y., & Ahn, J. (Eds.), (pp. 617-620). Bochum, Germany: International Society of the Learning Sciences.

https://doi.org/10.22318/icls2021.617

Tatar, C.; Yoder, M. M.; Coven, M.; Wiedemann, K.; Chao, J.; Finzer, W.; Jiang, S.; Rosé, C. P. (June 2021, Proceedings of the 15th International Conference of the Learning Sciences - ICLS 2021.)
de Vries, E.; Hod, Y.; Ahn, J. (Eds.)
In this paper, we present a co-design study with teachers to contribute towards development of a technology-enhanced Artificial Intelligence (AI) curriculum, focusing on modeling unstructured data. We created an initial design of a learning activity prototype and explored ways to incorporate the design into high school classes. Specifically, teachers explored text classification models with the prototype and reflected on the exploration as a user, learner, and teacher. They provided insights about learning opportunities in the activity and feedback for integrating it into their teaching. Findings from qualitative analysis demonstrate that exploring text classification models provided an accessible and comprehensive approach for integrated learning of mathematics, language arts, and computing with the potential of supporting the understanding of core AI concepts including identifying structure within unstructured data and reasoning about the roles of human insight in developing AI technologies.
Full Text Available