Search for: All records

Creators/Authors contains: "Wirz, Christopher"

(Re)Conceptualizing trustworthy AI: A foundation for change

https://doi.org/10.1016/j.artint.2025.104309

Wirz, Christopher D; Demuth, Julie L; Bostrom, Ann; Cains, Mariana G; Ebert-Uphoff, Imme; Gagne, David John; Schumacher, Andrea; McGovern, Amy; Madlambayan, Deianna (May 2025, Artificial Intelligence)

Free, publicly-accessible full text available May 1, 2026
National Weather Service (NWS) Forecasters’ Perceptions of AI/ML and Its Use in Operational Forecasting

https://doi.org/10.1175/BAMS-D-24-0044.1

Wirz, Christopher D; Demuth, Julie L; Cains, Mariana G; White, Miranda; Radford, Jacob; Bostrom, Ann (November 2024, Bulletin of the American Meteorological Society)

Abstract Artificial intelligence and machine learning (AI/ML) have attracted a great deal of attention from the atmospheric science community. The explosion of attention on AI/ML development carries implications for the operational community, prompting questions about how novel AI/ML advancements will translate from research into operations. However, the field lacks empirical evidence on how National Weather Service (NWS) forecasters, as key intended users, perceive AI/ML and its use in operational forecasting. This study addresses this crucial gap through structured interviews conducted with 29 NWS forecasters from October 2021 through July 2023 in which we explored their perceptions of AI/ML in forecasting. We found that forecasters generally prefer the term “machine learning” over “artificial intelligence” and that labeling a product as AI/ML did not hurt perceptions of the product and made some forecasters more excited about it. Forecasters also had a wide range of familiarity with AI/ML, and overall, they were (tentatively) open to the use of AI/ML in forecasting. We also provide examples of specific areas related to AI/ML that forecasters are excited or hopeful about and that they are concerned or worried about. One concern that was raised in several ways was that AI/ML could replace forecasters or remove them from the forecasting process. However, forecasters expressed a widespread and deep commitment to the best possible forecasts and services to uphold the agency mission using whatever tools or products are available to assist them. Last, we note how forecasters’ perceptions evolved over the course of the study.
Free, publicly-accessible full text available November 1, 2025
Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications

https://doi.org/10.1175/AIES-D-23-0093.1

Schreck, John S; Gagne, David John; Becker, Charlie; Chapman, William E; Elmore, Kim; Fan, Da; Gantos, Gabrielle; Kim, Eliot; Kimpara, Dhamma; Martin, Thomas; et al (October 2024, Artificial Intelligence for the Earth Systems)

Abstract Robust quantification of predictive uncertainty is a critical addition needed for machine learning applied to weather and climate problems to improve the understanding of what is driving prediction sensitivity. Ensembles of machine learning models provide predictive uncertainty estimates in a conceptually simple way but require multiple models for training and prediction, increasing computational cost and latency. Parametric deep learning can estimate uncertainty with one model by predicting the parameters of a probability distribution but does not account for epistemic uncertainty. Evidential deep learning, a technique that extends parametric deep learning to higher-order distributions, can account for both aleatoric and epistemic uncertainties with one model. This study compares the uncertainty derived from evidential neural networks to that obtained from ensembles. Through applications of the classification of winter precipitation type and regression of surface-layer fluxes, we show evidential deep learning models attaining predictive accuracy rivaling standard methods while robustly quantifying both sources of uncertainty. We evaluate the uncertainty in terms of how well the predictions are calibrated and how well the uncertainty correlates with prediction error. Analyses of uncertainty in the context of the inputs reveal sensitivities to underlying meteorological processes, facilitating interpretation of the models. The conceptual simplicity, interpretability, and computational efficiency of evidential neural networks make them highly extensible, offering a promising approach for reliable and practical uncertainty quantification in Earth system science modeling. 
To encourage broader adoption of evidential deep learning, we have developed a new Python package, Machine Integration and Learning for Earth Systems (MILES) group Generalized Uncertainty for Earth System Science (GUESS) (MILES-GUESS) (https://github.com/ai2es/miles-guess), that enables users to train and evaluate both evidential and ensemble deep learning.
Significance Statement: This study demonstrates a new technique, evidential deep learning, for robust and computationally efficient uncertainty quantification in modeling the Earth system. The method integrates probabilistic principles into deep neural networks, enabling the estimation of both aleatoric uncertainty from noisy data and epistemic uncertainty from model limitations using a single model. Our analyses reveal how decomposing these uncertainties provides valuable insights into reliability, accuracy, and model shortcomings. We show that the approach can rival standard methods in classification and regression tasks within atmospheric science while offering practical advantages such as computational efficiency. With further advances, evidential networks have the potential to enhance risk assessment and decision-making across meteorology by improving uncertainty quantification, a longstanding challenge. This work establishes a strong foundation and motivation for the broader adoption of evidential learning, where properly quantifying uncertainties is critical yet lacking.
Full Text Available
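
The abstract above contrasts ensembles with evidential deep learning as two ways to separate aleatoric from epistemic uncertainty. The following is a minimal sketch of that split for a regression target, not the MILES-GUESS API. It assumes the evidential network outputs the parameters of a Normal-Inverse-Gamma distribution (a common formulation that the abstract does not specify), and that each ensemble member predicts a mean and a variance. The function and parameter names (`evidential_uncertainty`, `ensemble_uncertainty`, `nu`, `alpha`, `beta`) are hypothetical.

```python
from statistics import fmean, pvariance


def evidential_uncertainty(nu, alpha, beta):
    """Split predictive variance for one evidential regression output.

    Assumes a Normal-Inverse-Gamma output with parameters nu, alpha, beta
    (an assumption for illustration, not taken from the listed paper).
    """
    if nu <= 0 or alpha <= 1 or beta <= 0:
        raise ValueError("requires nu > 0, alpha > 1, beta > 0")
    aleatoric = beta / (alpha - 1.0)         # expected data noise, E[sigma^2]
    epistemic = beta / (nu * (alpha - 1.0))  # variance of the predicted mean, Var[mu]
    return aleatoric, epistemic


def ensemble_uncertainty(means, variances):
    """Same split for an ensemble, using the law of total variance."""
    aleatoric = fmean(variances)  # average noise predicted by the members
    epistemic = pvariance(means)  # disagreement between member means
    return aleatoric, epistemic


if __name__ == "__main__":
    print(evidential_uncertainty(nu=2.0, alpha=3.0, beta=4.0))
    print(ensemble_uncertainty([1.0, 3.0], [0.5, 1.5]))
```

The evidential version needs one forward pass of one model, whereas the ensemble version needs a prediction from every member, which is the cost trade-off the abstract describes.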
Interviews with NWS Forecasters related to severe weather and new artificial intelligence/machine learning (AI/ML) guidance predicting severe hail and storm mode: Pre-interview survey data

https://doi.org/10.17603/ds2-11y2-bg84

Cains, Mariana; Wirz, Christopher; Demuth, Julie; Bostrom, Ann; Harrison, David; McGovern, Amy (January 2024, Designsafe-CI)

This project developed a pre-interview survey, interview protocols, and materials for conducting interviews with expert users to better understand how they assess and make use decisions about new AI/ML guidance. Weather forecasters access and synthesize myriad sources of information when forecasting for high-impact, severe weather events. In recent years, artificial intelligence (AI) techniques have increasingly been used to produce new guidance tools with the goal of aiding weather forecasting, including for severe weather. For this study, we leveraged these advances to explore how National Weather Service (NWS) forecasters perceive the use of new AI guidance for forecasting severe hail and storm mode. We also specifically examine which guidance features are important for how forecasters assess the trustworthiness of new AI guidance. To this aim, we conducted online, structured interviews with NWS forecasters from across the Eastern, Central, and Southern Regions. The interviews covered the forecasters’ approaches and challenges for forecasting severe weather, perceptions of AI and its use in forecasting, and reactions to one of two experimental (i.e., non-operational) AI severe weather guidance: probability of severe hail or probability of storm mode. During the interview, the forecasters went through a self-guided review of different sets of information about the development (spin-up information, AI model technique, training of AI model, input information) and performance (verification metrics, interactive output, output comparison to operational guidance) of the presented guidance. The forecasters then assessed how the information influenced their perception of how trustworthy the guidance was and whether or not they would consider using it for forecasting. This project includes the pre-interview survey, survey data, interview protocols, and accompanying information boards used for the interviews. 
There is one set of interview materials in which AI/ML is mentioned throughout and another set in which AI/ML is mentioned only at the end of the interviews. We did this to better understand how the label “AI/ML” did or did not affect how interviewees responded to interview questions and reviewed the information board. We also used think-aloud methods with the information board; the instructions for these are included in the interview protocols.
Interviews with NWS Forecasters related to severe weather and new artificial intelligence/machine learning (AI/ML) guidance predicting severe hail and storm mode: Pre-interview survey

https://doi.org/10.17603/ds2-mr3z-7947

Demuth, Julie; Bostrom, Ann; Harrison, David; McGovern, Amy; Wirz, Christopher; Cains, Mariana (January 2024, Designsafe-CI)

Interviews with NWS Forecasters related to severe weather and new artificial intelligence/machine learning (AI/ML) guidance predicting severe hail and storm mode: Interview materials for “AI/ML” version

https://doi.org/10.17603/ds2-8mgd-2j44

Cains, Mariana; Wirz, Christopher; Bostrom, Ann; Demuth, Julie; Ebert-Uphoff, Imme; Gagne, David John; McGovern, Amy; Sobash, Ryan; Burke, Amanda (January 2024, Designsafe-CI)

Increasing the Reproducibility and Replicability of Supervised AI/ML in the Earth Systems Science by Leveraging Social Science Methods

https://doi.org/10.1029/2023EA003364

Wirz, Christopher D; Sutter, Carly; Demuth, Julie L; Mayer, Kirsten J; Chapman, William E; Cains, Mariana Goodall; Radford, Jacob; Przybylo, Vanessa; Evans, Aaron; Martin, Thomas; et al (July 2024, Earth and Space Science)

Abstract Artificial intelligence (AI) and machine learning (ML) pose a challenge for achieving science that is both reproducible and replicable. The challenge is compounded in supervised models that depend on manually labeled training data, as they introduce additional decision‐making and processes that require thorough documentation and reporting. We address these limitations by providing an approach to hand labeling training data for supervised ML that integrates quantitative content analysis (QCA), a method from social science research. The QCA approach provides a rigorous and well‐documented hand labeling procedure to improve the replicability and reproducibility of supervised ML applications in Earth systems science (ESS), as well as the ability to evaluate them. Specifically, the approach requires (a) the articulation and documentation of the exact decision‐making process used for assigning hand labels in a “codebook” and (b) an empirical evaluation of the “reliability” of the hand labelers. In this paper, we outline the contributions of QCA to the field, along with an overview of the general approach. We then provide a case study to further demonstrate how this framework has been and can be applied when developing supervised ML models for applications in ESS. With this approach, we provide an actionable path forward for addressing ethical considerations and goals outlined by recent AGU work on ML ethics in ESS.
Full Text Available
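
The abstract above calls for an empirical evaluation of the reliability of the hand labelers but does not name a statistic. As one hedged illustration, the sketch below computes Cohen's kappa, a widely used chance-corrected agreement measure for two labelers. It is a stand-in chosen for this example and not necessarily the measure used in the listed paper. The function name `cohens_kappa` and the sample labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labelers on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two labelers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each labeler's category frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelers used a single identical category
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    labeler_1 = ["rain", "snow", "rain", "snow"]
    labeler_2 = ["rain", "snow", "snow", "snow"]
    print(cohens_kappa(labeler_1, labeler_2))
```

A value near 1 indicates that the codebook produces consistent labels across labelers, while a value near 0 indicates agreement no better than chance.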
Trust and trustworthy artificial intelligence: A research agenda for AI in the environmental sciences

https://doi.org/10.1111/risa.14245

Bostrom, Ann; Demuth, Julie L; Wirz, Christopher D; Cains, Mariana G; Schumacher, Andrea; Madlambayan, Deianna; Bansal, Akansha Singh; Bearth, Angela; Chase, Randy; Crosman, Katherine M; et al (June 2024, Risk Analysis)

Abstract Demands to manage the risks of artificial intelligence (AI) are growing. These demands and the government standards arising from them both call for trustworthy AI. In response, we adopt a convergent approach to review, evaluate, and synthesize research on the trust and trustworthiness of AI in the environmental sciences and propose a research agenda. Evidential and conceptual histories of research on trust and trustworthiness reveal persisting ambiguities and measurement shortcomings related to inconsistent attention to the contextual and social dependencies and dynamics of trust. Potentially underappreciated in the development of trustworthy AI for environmental sciences is the importance of engaging AI users and other stakeholders, which human–AI teaming perspectives on AI development similarly underscore. Co‐development strategies may also help reconcile efforts to develop performance‐based trustworthiness standards with dynamic and contextual notions of trust. We illustrate the importance of these themes with applied examples and show how insights from research on trust and the communication of risk and uncertainty can help advance the understanding of trust and trustworthiness of AI in the environmental sciences.
Full Text Available
Trustworthy Artificial Intelligence for Environmental Sciences: An Innovative Approach for Summer School

https://doi.org/10.1175/BAMS-D-22-0225.1

McGovern, Amy; Gagne, David John; Wirz, Christopher D.; Ebert-Uphoff, Imme; Bostrom, Ann; Rao, Yuhan; Schumacher, Andrea; Flora, Montgomery; Chase, Randy; Mamalakis, Antonios; et al (April 2023, Bulletin of the American Meteorological Society)

Abstract Many of our generation’s most pressing environmental science problems are wicked problems, which means they cannot be cleanly isolated and solved with a single ‘correct’ answer (e.g., Rittel 1973; Wirz 2021). The NSF AI Institute for Research on Trustworthy AI in Weather, Climate, and Coastal Oceanography (AI2ES) seeks to address such problems by developing synergistic approaches with a team of scientists from three disciplines: environmental science (including atmospheric, ocean, and other physical sciences), AI, and social science including risk communication. As part of our work, we developed a novel approach to summer school, held from June 27-30, 2022. The goal of this summer school was to teach a new generation of environmental scientists how to cross disciplines and develop approaches that integrate all three disciplinary perspectives in order to solve environmental science problems. In addition to a lecture series that focused on the synthesis of AI, environmental science, and risk communication, this year’s summer school included a unique Trust-a-thon component where participants gained hands-on experience applying both risk communication and explainable AI techniques to pre-trained ML models. We had 677 participants from 63 countries register and attend online. Lecture topics included trust and trustworthiness (Day 1), explainability and interpretability (Day 2), data and workflows (Day 3), and uncertainty quantification (Day 4). For the Trust-a-thon we developed challenge problems for three different application domains: (1) severe storms, (2) tropical cyclones, and (3) space weather. Each domain had an associated user persona to guide user-centered development.
Full Text Available
Ethical and Responsible Use of AI/ML in the Earth, Space, and Environmental Sciences

https://doi.org/10.22541/essoar.168132856.66485758/v1

Stall, Shelley; Cervone, Guido; Coward, Caroline; Cutcher-Gershenfeld, Joel; Donaldson, Thomas J; Erdmann, Chris; Hanson, R Brooks; Holm, Jeanne; King, John Leslie; Lyon, Laura; et al (April 2023, Authorea, Inc.)

Full Text Available
