NSF PAR Search | NSF Public Access Repository

A Meta-Summary of Challenges in Building Products with ML Components – Collecting Experiences from 4758+ Practitioners

https://doi.org/10.1109/CAIN58948.2023.00034

Nahar, Nadia; Zhang, Haoran; Lewis, Grace; Zhou, Shurui; Kästner, Christian (May 2023, 2023 IEEE/ACM 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN))

Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing ones. Many researchers have invested significant effort in understanding the challenges that industry practitioners face when building products with ML components, through interviews and surveys with practitioners. To aggregate and present their collective findings, we conduct a meta-summary study: following guidelines for systematic literature reviews, we collect 50 relevant papers whose authors together interacted with over 4758 practitioners. We then collect, group, and organize the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.
Full Text Available
MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

https://doi.org/10.1109/ICSE-NIER58687.2023.00012

Maffey, Katherine R.; Dotterrer, Kyle; Niemann, Jennifer; Cruickshank, Iain; Lewis, Grace A.; Kästner, Christian (May 2023, 2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER))

Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.
Full Text Available
Beyond Testers’ Biases: Guiding Model Testing with Knowledge Bases using LLMs

https://doi.org/10.18653/v1/2023.findings-emnlp.901

Yang, Chenyang; Rustogi, Rishabh; Brower-Sinning, Rachel; Lewis, Grace; Kaestner, Christian; Wu, Tongshuang (January 2023, Association for Computational Linguistics)

Full Text Available
Data Leakage in Notebooks: Static Detection and Better Processes

https://doi.org/10.1145/3551349.3556918

Yang, Chenyang; Brower-Sinning, Rachel A; Lewis, Grace; Kaestner, Christian (October 2022, ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering)

Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model’s accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and challenging to detect manually. We develop a static analysis approach to detect common forms of data leakage in data science code. Our evaluation shows that our analysis accurately detects data leakage and that such leakage is pervasive among over 100,000 analyzed public notebooks. We discuss how our static analysis approach can help both practitioners and educators, and how leakage prevention can be designed into the development process.
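To make the notion of leakage concrete, here is a minimal sketch of one commonly cited form: fitting a preprocessing step on all rows before the train/test split. It is an illustration only, not the static analysis described in the abstract; the toy data and the helper names (`split`, `fit_scaler`, `scale`) are hypothetical and use only the Python standard library.

```python
import random
import statistics


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_scaler(values):
    """Return (mean, stdev) estimated from the given values only."""
    return statistics.mean(values), statistics.pstdev(values) or 1.0


def scale(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


data = [float(x) for x in range(100)]  # toy feature column (hypothetical)

# Leaky: statistics are fitted on ALL rows, so the test rows influence
# the preprocessing that is applied to the training rows.
leaky_mean, leaky_std = fit_scaler(data)
leaky_train, leaky_test = split(scale(data, leaky_mean, leaky_std))

# Leak-free: split first, fit on the training rows only, then apply the
# same fitted statistics to the test rows.
train, test = split(data)
train_mean, train_std = fit_scaler(train)
clean_train = scale(train, train_mean, train_std)
clean_test = scale(test, train_mean, train_std)
```

In the leak-free variant the test rows never contribute to `train_mean` or `train_std`, which is the property a leakage check would look for in this pattern.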
Full Text Available
Collaboration challenges in building ML-enabled systems: communication, documentation, engineering, and process

https://doi.org/10.1145/3510003.3510209

Nahar, Nadia; Zhou, Shurui; Lewis, Grace; Kästner, Christian (May 2022, ICSE '22: Proceedings of the 44th International Conference on Software Engineering)

The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process, and collect recommendations to address these challenges.
Full Text Available
The Role of Edge Offload for Hardware-Accelerated Mobile Devices

https://doi.org/10.1145/3446382.3448360

Satyanarayanan, Mahadev; Beckmann, Nathan; Lewis, Grace A.; Lucia, Brandon (February 2021, HotMobile '21: Proceedings of the 22nd International Workshop on Mobile Computing Systems and Applications)

This position paper examines a spectrum of approaches to overcoming the limited computing power of mobile devices caused by their need to be small, lightweight and energy efficient. At one extreme is offloading of compute-intensive operations to a cloudlet nearby. At the other extreme is the use of fixed-function hardware accelerators on mobile devices. Between these endpoints lie various configurations of programmable hardware accelerators. We explore the strengths and weaknesses of these approaches and conclude that they are, in fact, complementary. Based on this insight, we advocate a software-hardware co-evolution path that combines their strengths.
