MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

Maffey, Katherine R.; Dotterrer, Kyle; Niemann, Jennifer; Cruickshank, Iain; Lewis, Grace A.; Kästner, Christian

doi:10.1109/ICSE-NIER58687.2023.00012

Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.

Award ID(s): 2131477

PAR ID: 10444834

Author(s) / Creator(s): Maffey, Katherine R.; Dotterrer, Kyle; Niemann, Jennifer; Cruickshank, Iain; Lewis, Grace A.; Kästner, Christian

Date Published: 2023-05-01

Journal Name: 2023 IEEE/ACM 45th International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER)

Page Range / eLocation ID: 31 to 36

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1109/ICSE-NIER58687.2023.00012
