Automated, data-driven decision making is increasingly common in a variety of application domains. In educational software, for example, machine learning has been applied to tasks like selecting the next exercise for students to complete. Machine learning methods, however, are not always equally effective for all groups of students. Current approaches to designing fair algorithms tend to focus on statistical measures concerning a small subset of legally protected categories like race or gender. Focusing solely on legally protected categories, however, can limit our understanding of bias and unfairness by ignoring the complexities of identity. We propose an alternative approach to categorization, grounded in sociological techniques of measuring identity. By soliciting survey data and interviews from the population being studied, we can build context-specific categories from the bottom up. The emergent categories can then be combined with extant algorithmic fairness strategies to discover which identity groups are not well-served, and thus where algorithms should be improved or avoided altogether. We focus on educational applications but present arguments that this approach should be adopted more broadly for issues of algorithmic fairness across a variety of applications.
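The bottom-up workflow described above, coding free-response identity statements into emergent categories and then checking model performance per category, can be sketched as follows. The keyword lists and data here are hypothetical stand-ins for real qualitative coding, not the authors' actual coding scheme:

```python
from collections import defaultdict

# Hypothetical keyword coding of free-response identity statements into
# emergent categories (a crude stand-in for human qualitative coding).
CATEGORY_KEYWORDS = {
    "learner_identity": {"student", "learner", "curious"},
    "interpersonal_style": {"friend", "shy", "outgoing"},
}

def code_statements(statements):
    """Assign each respondent the set of categories their statements evoke."""
    cats = set()
    for s in statements:
        words = set(s.lower().split())
        for cat, kws in CATEGORY_KEYWORDS.items():
            if words & kws:
                cats.add(cat)
    return cats

def error_rate_by_category(respondents):
    """respondents: list of (statements, model_was_wrong) pairs.
    Returns the model's error rate within each emergent category."""
    totals, errors = defaultdict(int), defaultdict(int)
    for statements, wrong in respondents:
        for cat in code_statements(statements):
            totals[cat] += 1
            errors[cat] += wrong
    return {c: errors[c] / totals[c] for c in totals}

# Hypothetical survey responses paired with whether a model erred.
respondents = [
    (["I am a curious person", "I am shy"], 1),
    (["I am a student"], 0),
    (["I am outgoing"], 1),
    (["I am a learner"], 0),
]
rates = error_rate_by_category(respondents)
```

A large gap between categories in `rates` would flag an identity group the model does not serve well.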
-
Algorithmic bias research often evaluates models in terms of traditional demographic categories (e.g., U.S. Census), but these categories may not capture nuanced, context-dependent identities relevant to learning. This study evaluates four affect detectors (boredom, confusion, engaged concentration, and frustration) developed for an adaptive math learning system. Metrics for algorithmic fairness (AUC, weighted F1, MADD) show subgroup differences across several categories that emerged from a free-response social identity survey (Twenty Statements Test; TST), including both those that mirror demographic categories (i.e., race and gender) as well as novel categories (i.e., Learner Identity, Interpersonal Style, and Sense of Competence). For demographic categories, the confusion detector performs better for boys than for girls and underperforms for West African students. Among novel categories, biases are found related to learner identity (boredom, engaged concentration, and confusion) and interpersonal style (confusion), but not for sense of competence. Results highlight the importance of using contextually grounded social identities to evaluate bias.
Free, publicly-accessible full text available December 1, 2026.
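The core fairness check here, comparing a detector's AUC across identity subgroups, can be sketched in a few lines. The records below are invented examples, and AUC is computed directly from the Mann-Whitney pairwise-comparison definition rather than with a library:

```python
from collections import defaultdict

def auc(y_true, y_score):
    """AUC via the Mann-Whitney U statistic: the probability that a random
    positive example is scored above a random negative one."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    if not pos or not neg:
        return float("nan")
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def subgroup_auc(records):
    """Group (label, score, group) records and compute AUC per group."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in records:
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}

# Hypothetical detector outputs: (true_label, predicted_score, group).
records = [
    (1, 0.9, "A"), (0, 0.2, "A"), (1, 0.8, "A"), (0, 0.4, "A"),
    (1, 0.6, "B"), (0, 0.5, "B"), (1, 0.3, "B"), (0, 0.7, "B"),
]
per_group = subgroup_auc(records)
gap = max(per_group.values()) - min(per_group.values())
```

A nonzero `gap` is the kind of subgroup difference the study's fairness metrics surface; the same grouping works whether the categories are demographic or emergent (TST-derived).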
-
Adaptive learning systems are increasingly common in U.S. classrooms, but it is not yet clear whether their positive impacts are realized equally across all students. This study explores whether nuanced identity categories from open-ended self-reported data are associated with outcomes in an adaptive learning system for secondary mathematics. As a measure of impact of these social identity data, we correlate student responses for 3 categories: race and ethnicity, gender, and learning identity (a category combining student status and orientation toward learning) and total lessons completed in an adaptive learning system over one academic year. Results show the value of emergent and novel identity categories when measuring student outcomes, as learning identity was positively correlated with mathematics outcomes across two statistical tests.
Free, publicly-accessible full text available July 21, 2026.
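The correlational analysis described above can be illustrated with a plain Pearson correlation between a binary identity indicator and lessons completed. The numbers below are fabricated for illustration only and do not reflect the study's data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the covariance definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: 1 = student's open-ended responses expressed a
# learning identity, 0 = they did not; outcome = lessons completed.
learning_identity = [1, 0, 1, 1, 0, 0, 1, 0]
lessons_completed = [42, 18, 37, 51, 22, 15, 44, 30]
r = pearson_r(learning_identity, lessons_completed)
```

A positive `r` on real data would correspond to the study's finding that learning identity is positively associated with mathematics outcomes.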
-
Mills, Caitlin; Alexandron, Giora; Taibi, Davide; Lo_Bosco, Giosuè; Paquette, Luc (Ed.)
Recent research on more comprehensive models of student learning in adaptive math learning software used an indicator of student reading ability to predict students' tendencies to engage in behaviors associated with so-called "gaming the system." Using data from Carnegie Learning's MATHia adaptive learning software, we replicate the finding that students likely to experience reading difficulties are more likely to engage in behaviors associated with gaming the system. Using both observational and experimental data, we consider relationships between student reading ability, readability of specific math lessons, and behavior associated with gaming. We identify several readability characteristics of specific content that predict detected gaming behavior, as well as evidence that a prior experiment that targeted enhanced content readability decreased behavior associated with gaming, but only for students who are predicted to be less likely to experience reading difficulties. We suggest avenues for future research to better understand and model the behavior of math learners, especially those who may be experiencing reading difficulties while they learn math.
Free, publicly-accessible full text available July 14, 2026.
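The "readability characteristics of specific content" used as predictors in work like this are typically standard text metrics. As one illustrative example (not the study's actual feature set), the Flesch-Kincaid grade level can be computed with its standard formula and a rough syllable heuristic:

```python
import re

def count_syllables(word):
    """Rough syllable count: runs of vowels, minimum one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    """Flesch-Kincaid grade level (standard formula, heuristic syllables):
    0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / sentences)
            + 11.8 * (syllables / len(words)) - 15.59)

simple = "The cat sat. The dog ran. We add two and two."
complex_ = ("Consequently, simultaneous equations necessitate systematic "
            "elimination of corresponding variables.")
```

Lesson-level features like this could then be regressed against detected gaming behavior to test whether harder-to-read content is associated with more gaming.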
-
Mills, Caitlin; Alexandron, Giora; Taibi, Davide; Lo_Bosco, Giosuè; Paquette, Luc (Ed.)
Students' reading ability affects their outcomes in learning software even outside of reading education, such as in math education, which can result in unexpected and inequitable outcomes. We analyze an adaptive learning software using Bayesian Knowledge Tracing (BKT) to understand how the fairness of the software is impacted when reading ability is not modeled. We tested BKT model fairness by comparing two years of data from 8,549 students who were classified as either "emerging" or "non-emerging" readers (i.e., a measure of reading ability). We found that while BKT was unbiased on average in terms of equal predictive accuracy across groups, specific skills within the adaptive learning software exhibited bias related to reading level. Additionally, there were differences between the first-answer mastery rates of the emerging and non-emerging readers (M=.687 and M=.776, difference CI=[0.075, 0.095]), indicating that emerging reader status is predictive of mastery. Our findings demonstrate significant group differences in BKT models regarding reading ability, showing that it is important to consider—and perhaps even model—reading as a separate skill that differentially influences students' outcomes.
Free, publicly-accessible full text available July 14, 2026.
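For readers unfamiliar with BKT, a minimal sketch of its per-answer update helps make the fairness question concrete: the model's mastery estimate depends on fixed guess/slip/learn parameters, so if those parameters fit one reader group better than another, mastery estimates will be systematically off for the other group. The parameter values below are illustrative defaults, not the study's fitted values:

```python
def bkt_update(p_know, correct, p_learn=0.1, p_guess=0.2, p_slip=0.1):
    """One Bayesian Knowledge Tracing step: Bayes-update the mastery
    probability on the observed answer, then apply the learning transition."""
    if correct:
        evidence = p_know * (1 - p_slip)
        cond = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        evidence = p_know * p_slip
        cond = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    return cond + (1 - cond) * p_learn

# Trace a hypothetical student through four answers.
p = 0.3
for ans in [True, True, False, True]:
    p = bkt_update(p, ans)
```

A group-level fairness audit like the study's compares predictive accuracy of such models between emerging and non-emerging readers, overall and per skill.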
-
Educational data mining has allowed for large improvements in educational outcomes and understanding of educational processes. However, there remains a constant tension between educational data mining advances and protecting student privacy while using educational datasets. Publicly available datasets have facilitated numerous research projects while striving to preserve student privacy via strict anonymization protocols (e.g., k-anonymity); however, little is known about the relationship between anonymization and utility of educational datasets for downstream educational data mining tasks, nor how anonymization processes might be improved for such tasks. We provide a framework for strictly anonymizing educational datasets with a focus on improving downstream performance in common tasks such as student outcome prediction. We evaluate our anonymization framework on five diverse educational datasets with machine learning-based downstream task examples to demonstrate both the effect of anonymization and our means to improve it. Our method improves downstream machine learning accuracy versus baseline data anonymization by 30.59%, on average, by guiding the anonymization process toward strategies that anonymize the least important information while leaving the most valuable information intact.
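The k-anonymity protocol mentioned above requires that every combination of quasi-identifier values be shared by at least k records; generalizing a column (e.g., bucketing exact ages) is the usual way to reach it. This is a generic sketch of that mechanism, not the paper's framework, and the records are invented:

```python
from collections import Counter

def is_k_anonymous(rows, quasi_ids, k):
    """True if every quasi-identifier combination appears at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return all(c >= k for c in counts.values())

def generalize_age(rows, width=10):
    """Coarsen exact ages into buckets (e.g., 13 -> '10-19'),
    trading detail for anonymity."""
    out = []
    for r in rows:
        lo = (r["age"] // width) * width
        out.append({**r, "age": f"{lo}-{lo + width - 1}"})
    return out

rows = [
    {"age": 13, "grade": 7, "score": 0.8},
    {"age": 14, "grade": 7, "score": 0.6},
    {"age": 13, "grade": 7, "score": 0.9},
    {"age": 12, "grade": 7, "score": 0.7},
]
raw_ok = is_k_anonymous(rows, ["age", "grade"], 2)          # raw ages fail
bucketed_ok = is_k_anonymous(generalize_age(rows), ["age", "grade"], 2)
```

The paper's contribution sits on top of this mechanism: choosing *which* columns to generalize so the least predictive information is coarsened first, preserving downstream model accuracy.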
-
Feng, Mingyu; Käser, Tanja; Talukdar, Partha (Ed.)
Recent research seeks to develop more comprehensive learner models for adaptive learning software. For example, models of reading comprehension built using data from students' use of adaptive instructional software for mathematics have recently been developed. These models aim to deliver experiences that consider factors related to learning beyond performance in the target domain for instruction. We investigate the extent to which generalization is possible for a recently developed predictive model that seeks to infer students' reading comprehension ability (as measured by end-of-year standardized test scores) using an introductory learning experience in Carnegie Learning's MATHia intelligent tutoring system for mathematics. Building on a model learned on data from middle school students in a single school district in a mid-western U.S. state, using that state's end-of-year English Language Arts (ELA) standardized test score as an outcome, we consider data from a school district in a south-eastern U.S. state as well as that state's end-of-year ELA standardized test outcome. Generalization is explored by considering prediction performance when training and testing models on data from each of the individual school districts (and for their respective state's test outcomes) as well as pooling data from both districts together. We conclude with discussion of investigations of some algorithmic fairness characteristics of the learned models. The results suggest that a model trained on data from the smaller of the two school districts considered may achieve greater fairness in its predictions over models trained on data from the other district or both districts, despite broad, overall similarities in some demographic characteristics of the two school districts. This raises interesting questions for future research on generalizing these kinds of models as well as on ensuring algorithmic fairness of resulting models for use in real-world adaptive systems for learning.
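The train/test design described above (each district alone, plus pooled) can be sketched with a deliberately trivial baseline model. All data and numbers here are fabricated; the point is only the evaluation loop over train/test district pairs:

```python
def fit_mean(train):
    """Trivial baseline: always predict the training-set mean outcome."""
    ys = [y for _, y in train]
    m = sum(ys) / len(ys)
    return lambda x: m

def mae(model, test):
    """Mean absolute error of a model on (feature, outcome) pairs."""
    return sum(abs(model(x) - y) for x, y in test) / len(test)

# Hypothetical (MATHia feature, ELA score) pairs for two districts whose
# state tests are scaled differently.
district_a = [(0.2, 410), (0.5, 440), (0.8, 480), (0.6, 455)]
district_b = [(0.3, 500), (0.6, 540), (0.9, 575), (0.4, 515)]

results = {}
for tr_name, tr, te_name, te in [
    ("A", district_a, "B", district_b),
    ("B", district_b, "A", district_a),
    ("A+B", district_a + district_b, "A+B", district_a + district_b),
]:
    results[(tr_name, te_name)] = mae(fit_mean(tr), te)
```

Even this toy setup shows the failure mode the study probes: a model fit to one district transfers poorly to another whose outcome distribution differs, while pooling masks (but does not remove) the gap. The real study additionally audits fairness characteristics of each trained model.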