skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Malik, Shaivi"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Data leakage remains a pervasive issue in machine learning (ML), especially when applied to science, leading to overly optimistic performance estimates and irreproducible findings. Despite its prevalence, data leakage receives limited attention in ML education, in part due to the lack of accessible, hands-on teaching resources. To address this gap, we developed interactive learning modules in which students reproduce examples from academic publications that are affected by data leakage, then repeat the evaluation without the data leakage error to see how the finding is affected. These modules were deployed by the authors in two introductory machine learning courses, enabling students to explore common forms of leakage and their impact on model reliability. Following their engagement with these materials, student feedback highlighted increased awareness of subtle pitfalls that can compromise machine learning workflows. 
    more » « less
    Free, publicly-accessible full text available July 29, 2026