skip to main content


Search for: All records

Award ID contains: 2000405

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Abstract

    Automated, data‐driven decision making is increasingly common in a variety of application domains. In educational software, for example, machine learning has been applied to tasks like selecting the next exercise for students to complete. Machine learning methods, however, are not always equally effective for all groups of students. Current approaches to designing fair algorithms tend to focus on statistical measures concerning a small subset of legally protected categories like race or gender. Focusing solely on legally protected categories, however, can limit our understanding of bias and unfairness by ignoring the complexities of identity. We propose an alternative approach to categorization, grounded in sociological techniques of measuring identity. By soliciting survey data and interviews from the population being studied, we can build context‐specific categories from the bottom up. The emergent categories can then be combined with extant algorithmic fairness strategies to discover which identity groups are not well‐served, and thus where algorithms should be improved or avoided altogether. We focus on educational applications but present arguments that this approach should be adopted more broadly for issues of algorithmic fairness across a variety of applications.

     
    more » « less
  2. Research into "gaming the system" behavior in intelligent tutoring systems (ITS) has been around for almost two decades, and detection has been developed for many ITSs. Machine learning models can detect this behavior in both real-time and in historical data. However, intelligent tutoring system designs often change over time, in terms of the design of the student interface, assessment models, and data collection log schemas. Can gaming detectors still be trusted, a decade or more after they are developed? In this research, we evaluate the robustness/degradation of gaming detectors when trained on older data logs and evaluated on current data logs. We demonstrate that some machine learning models developed using past data are still able to predict gaming behavior from student data collected 16 years later, but that there is considerable variance in how well different algorithms perform over time. We demonstrate that a classic decision tree algorithm maintained its performance while more contemporary algorithms struggled to transfer to new data, even though they exhibited better performance on unseen students in both New and Old data sets by themselves. Examining the feature importance values provides some explanation for the differences in performance between models, and offers some insight into how we might safeguard against detector rot over time. 
    more » « less