NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Regulating Explainability in Machine Learning Applications -- Observations from a Policy Design Experiment

https://doi.org/10.1145/3630106.3659028

Nahar, Nadia; Rowlett, Jenny; Bray, Matthew; Omar, Zahra Abba; Papademetris, Xenophon; Menon, Alka; Kästner, Christian (June 2024, ACM)

Full Text Available
A Meta-Summary of Challenges in Building Products with ML Components – Collecting Experiences from 4758+ Practitioners

https://doi.org/10.1109/CAIN58948.2023.00034

Nahar, Nadia; Zhang, Haoran; Lewis, Grace; Zhou, Shurui; Kästner, Christian (May 2023, 2023 IEEE/ACM 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN))

Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing ones. Many researchers have invested significant effort in understanding the challenges of industry practitioners working on building products with ML components, through interviews and surveys with practitioners. With the intention to aggregate and present their collective findings, we conduct a meta-summary study: We collect 50 relevant papers that together interacted with over 4758 practitioners using guidelines for systematic literature reviews. We then collected, grouped, and organized the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.
more » « less
Full Text Available
MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

https://doi.org/10.1109/ICSE-NIER58687.2023.00012

Maffey, Katherine R.; Dotterrer, Kyle; Niemann, Jennifer; Cruickshank, Iain; Lewis, Grace A.; Kästner, Christian (May 2023, 2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER))

Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.
more » « less
Full Text Available
Aspirations and Practice of ML Model Documentation: Moving the Needle with Nudging and Traceability

https://doi.org/10.1145/3544548.3581518

Bhat, Avinash; Coursey, Austin; Hu, Grace; Li, Sixian; Nahar, Nadia; Zhou, Shurui; Kästner, Christian; Guo, Jin L.C. (April 2023, CHI '23: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems)

The documentation practice for machine-learned (ML) models often falls short of established practices for traditional software, which impedes model accountability and inadvertently abets inappropriate or misuse of models. Recently, model cards, a proposal for model documentation, have attracted notable attention, but their impact on the actual practice is unclear. In this work, we systematically study the model documentation in the field and investigate how to encourage more responsible and accountable documentation practice. Our analysis of publicly available model cards reveals a substantial gap between the proposal and the practice. We then design a tool named DocML aiming to (1) nudge the data scientists to comply with the model cards proposal during the model development, especially the sections related to ethics, and (2) assess and manage the documentation quality. A lab study reveals the benefit of our tool towards long-term documentation quality and accountability.
more » « less
Full Text Available
Beyond Testers’ Biases: Guiding Model Testing with Knowledge Bases using LLMs

https://doi.org/10.18653/v1/2023.findings-emnlp.901

Yang, Chenyang; Rustogi, Rishabh; Brower-Sinning, Rachel; Lewis, Grace; Kaestner, Christian; Wu, Tongshuang (January 2023, Association for Computational Linguistics)

Full Text Available
Data Leakage in Notebooks: Static Detection and Better Processes

https://doi.org/10.1145/3551349.3556918

Yang, Chenyang; Brower-Sinning, Rachel A; Lewis, Grace; Kaestner, Christian (October 2022, ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering)

Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model’s accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and challenging to detect manually. We develop a static analysis approach to detect common forms of data leakage in data science code. Our evaluation shows that our analysis accurately detects data leakage and that such leakage is pervasive among over 100,000 analyzed public notebooks. We discuss how our static analysis approach can help both practitioners and educators, and how leakage prevention can be designed into the development process.
more » « less
Full Text Available
Elevating Jupyter Notebook Maintenance Tooling by Identifying and Extracting Notebook Structures

https://doi.org/10.1109/ICSME55016.2022.00047

Jiang, Yuan; Kastner, Christian; Zhou, Shurui (October 2022, 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME))

Data analysis is an exploratory, interactive, and often collaborative process. Computational notebooks have become a popular tool to support this process, among others because of their ability to interleave code, narrative text, and results. However, notebooks in practice are often criticized as hard to maintain and being of low code quality, including problems such as unused or duplicated code and out-of-order code execution. Data scientists can benefit from better tool support when maintaining and evolving notebooks. We argue that central to such tool support is identifying the structure of notebooks. We present a lightweight and accurate approach to extract notebook structure and outline several ways such structure can be used to improve maintenance tooling for notebooks, including navigation and finding alternatives.
more » « less
Full Text Available
Collaboration challenges in building ML-enabled systems: communication, documentation, engineering, and process

https://doi.org/10.1145/3510003.3510209

Nahar, Nadia; Zhou, Shurui; Lewis, Grace; Kästner, Christian (May 2022, ICSE '22: Proceedings of the 44th International Conference on Software Engineering)

The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process, and collect recommendations to address these challenges.
more » « less
Full Text Available
Feature Interactions on Steroids: On the Composition of ML Models

https://doi.org/10.1109/MS.2021.3134386

Apel, Sven; Kastner, Christian; Kang, Eunsuk (May 2022, IEEE Software)

Full Text Available

Search for: All records