Title: Improving the reproducibility in geoscientific papers: lessons learned from a Hackathon in climate science
Abstract In this paper, we explore the crucial role and challenges of computational reproducibility in geosciences, drawing insights from the Climate Informatics Reproducibility Challenge (CIRC) in 2023. The competition aimed at (1) identifying common hurdles to reproducing computational climate science; and (2) creating interactive reproducible publications for selected papers of the Environmental Data Science journal. Based on lessons learned from the challenge, we emphasize the significance of open research practices, mentorship, transparency guidelines, as well as the use of technologies such as executable research objects for the reproduction of published geoscientific research. We propose a supportive framework of tools and infrastructure for evaluating reproducibility in geoscientific publications, with a case study for the climate informatics community. While the recommendations focus on future CIRCs, we expect they would be beneficial for a wider umbrella of reproducibility initiatives in geosciences.
Award ID(s):
2226453
PAR ID:
10600883
Publisher / Repository:
Cambridge University Press
Journal Name:
Environmental Data Science
Volume:
4
ISSN:
2634-4602
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The massive surge in the amount of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. This document was written by members of the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO, https://is-geo.org/) to describe our vision to build and enhance such collaboration through the use of specially designed benchmark datasets. Benchmark datasets serve as summary descriptions of problem areas, providing a simple interface between disciplines without requiring extensive background knowledge. Benchmark data are intended to address a number of overarching goals. First, they are concrete, identifiable, and public, which results in a natural coordination of research efforts across multiple disciplines and institutions. Second, they provide manifold opportunities for objective comparison of various algorithms in terms of computational costs, accuracy, utility and other measurable standards, to address a particular question in geoscience. Third, as materials for education, the benchmark data cultivate future human capital and interest in geoscience problems and data science methods. Finally, a concerted effort to produce and publish benchmarks has the potential to spur the development of new data science methods, while providing deeper insights into many fundamental problems in modern geosciences. That is, similarly to the critical role the genomic and molecular biology data archives serve in facilitating the field of bioinformatics, we expect that the proposed geosciences data repository will serve as a “catalyst” for the new discipline of geoinformatics. We describe specifications of a high-quality geoscience benchmark dataset and discuss some of our first benchmark efforts. We invite the Climate Informatics community to join us in creating additional benchmarks that aim to address important climate science problems.
  2. The massive surge in the amount of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. This document was written by members of the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO, https://is-geo.org/) to describe our vision to build and enhance such collaboration through the use of specially designed benchmark datasets. Benchmark datasets serve as summary descriptions of problem areas, providing a simple interface between disciplines without requiring extensive background knowledge. Benchmark data are intended to address a number of overarching goals. First, they are concrete, identifiable, and public, which results in a natural coordination of research efforts across multiple disciplines and institutions. Second, they provide manifold opportunities for objective comparison of various algorithms in terms of computational costs, accuracy, utility and other measurable standards, to address a particular question in geoscience. Third, as materials for education, the benchmark data cultivate future human capital and interest in geoscience problems and data science methods. Finally, a concerted effort to produce and publish benchmarks has the potential to spur the development of new data science methods, while providing deeper insights into many fundamental problems in modern geosciences. That is, similarly to the critical role the genomic and molecular biology data archives serve in facilitating the field of bioinformatics, we expect that the proposed geosciences data repository will serve as a “catalyst” for the new discipline of geoinformatics. We describe specifications of a high-quality geoscience benchmark dataset and discuss some of our first benchmark efforts. We invite the Climate Informatics community to join us in creating additional benchmarks that aim to address important climate science problems.
  3. Abstract Neural networks have become increasingly prevalent within the geosciences, although a common limitation of their usage has been a lack of methods to interpret what the networks learn and how they make decisions. As such, neural networks have often been used within the geosciences to most accurately identify a desired output given a set of inputs, with the interpretation of what the network learns used as a secondary metric to ensure the network is making the right decision for the right reason. Neural network interpretation techniques have become more advanced in recent years, however, and we therefore propose that the ultimate objective of using a neural network can also be the interpretation of what the network has learned rather than the output itself. We show that the interpretation of neural networks can enable the discovery of scientifically meaningful connections within geoscientific data. In particular, we use two methods for neural network interpretation called backward optimization and layerwise relevance propagation (LRP), both of which project the decision pathways of a network back onto the original input dimensions. To the best of our knowledge, LRP has not yet been applied to geoscientific research, and we believe it has great potential in this area. We show how these interpretation techniques can be used to reliably infer scientifically meaningful information from neural networks by applying them to common climate patterns. These results suggest that combining interpretable neural networks with novel scientific hypotheses will open the door to many new avenues in neural network-related geoscience research.
  4. What new questions could ecophysiologists answer if physio-logging research was fully reproducible? We argue that technical debt (computational hurdles resulting from prioritizing short-term goals over long-term sustainability) stemming from insufficient cyberinfrastructure (field-wide tools, standards, and norms for analyzing and sharing data) trapped physio-logging in a scientific silo. This debt stifles comparative biological analyses and impedes interdisciplinary research. Although physio-loggers (e.g., heart rate monitors and accelerometers) opened new avenues of research, the explosion of complex datasets exceeded ecophysiology’s informatics capacity. Like many other scientific fields facing a deluge of complex data, ecophysiologists now struggle to share their data and tools. Adapting to this new era requires a change in mindset, from “data as a noun” (e.g., traits, counts) to “data as a sentence”, where measurements (nouns) are associated with transformations (verbs), parameters (adverbs), and metadata (adjectives). Computational reproducibility provides a framework for capturing the entire sentence. Though usually framed in terms of scientific integrity, reproducibility offers immediate benefits by promoting collaboration between individuals, groups, and entire fields. Rather than a tax on our productivity that benefits some nebulous greater good, reproducibility can accelerate the pace of discovery by removing obstacles and inviting a greater diversity of perspectives to advance science and society. In this article, we 1) describe the computational challenges facing physio-logging scientists and connect them to the concepts of technical debt and cyberinfrastructure, 2) demonstrate how other scientific fields overcame similar challenges by embracing computational reproducibility, and 3) present a framework to promote computational reproducibility in physio-logging, and bio-logging more generally.
  5. Abstract Methods of explainable artificial intelligence (XAI) are used in geoscientific applications to gain insights into the decision-making strategy of neural networks (NNs), highlighting which features in the input contribute the most to a NN prediction. Here, we discuss our “lesson learned” that the task of attributing a prediction to the input does not have a single solution. Instead, the attribution results depend greatly on the baseline that the XAI method uses, a fact that has been overlooked in the geoscientific literature. The baseline is a reference point to which the prediction is compared so that the prediction can be understood. This baseline can be chosen by the user or is set by construction in the method’s algorithm, often without the user being aware of that choice. We highlight that different baselines can lead to different insights for different science questions and, thus, should be chosen accordingly. To illustrate the impact of the baseline, we use a large ensemble of historical and future climate simulations forced with the shared socioeconomic pathway 3-7.0 (SSP3-7.0) scenario and train a fully connected NN to predict the ensemble- and global-mean temperature (i.e., the forced global warming signal) given an annual temperature map from an individual ensemble member. We then use various XAI methods and different baselines to attribute the network predictions to the input. We show that attributions differ substantially when considering different baselines, because they correspond to answering different science questions. We conclude by discussing important implications and considerations about the use of baselines in XAI research.
     Significance Statement: In recent years, methods of explainable artificial intelligence (XAI) have been widely used in geoscientific applications, because they can be used to attribute the predictions of neural networks (NNs) to the input and interpret them physically. Here, we highlight that the attributions, and their physical interpretation, depend greatly on the choice of the baseline, a fact that has been overlooked in the geoscientific literature. We illustrate this dependence for a specific climate task, in which a NN is trained to predict the ensemble- and global-mean temperature (i.e., the forced global warming signal) given an annual temperature map from an individual ensemble member. We show that attributions differ substantially when considering different baselines, because they correspond to answering different science questions.