NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Temporal Comparisons Involving Paleoclimate Data Assimilation: Challenges & Remedies

https://doi.org/10.1175/JCLI-D-24-0101.1

Emile-Geay, Julien; Hakim, Gregory J; Viens, Frederi; Zhu, Feng; Amrhein, Daniel E (December 2024, Journal of Climate)

Abstract Paleoclimate reconstructions are increasingly central to climate assessments, placing recent and future variability in a broader historical context. Paleoclimate reconstructions are increasingly central to climate assessments, placing recent and future variability in a broader historical context. Several estimation methods produce plumes of climate trajectories that practitioners often want to compare to other reconstruction ensembles, or to deterministic trajectories produced by other means, such as global climate models. Of particular interest are “offline” data assimilation (DA) methods, which have recently been adapted to paleoclimatology. Offline DA lacks an explicit model connecting time instants, so its ensemble members are not true system trajectories. This obscures quantitative comparisons, particularly when considering the ensemble mean in isolation. We propose several resampling methods to introduce a priori constraints on temporal behavior, as well as a general notion, called plume distance, to carry out quantitative comparisons between collections of climate trajectories (“plumes”). The plume distance provides a norm in the same physical units as the variable of interest (e.g. °C for temperature), and lends itself to assessments of statistical significance. We apply these tools to four paleoclimate comparisons: (1) global mean surface temperature (GMST) in the online and offline versions of the Last Millennium Reanalysis (v2.1); (2) GMST from these two ensembles to simulations of the Paleoclimate Model Intercomparison Project past1000 ensemble; (3) LMRv2.1 to the PAGES 2k (2019) ensemble of GMST and (4) northern hemisphere mean surface temperature from LMR v2.1 to the Büntgen et al. (2021) ensemble. Results generally show more compatibility between these ensembles than is visually apparent. The proposed methodology is implemented in an open-source Python package, and we discuss possible applications of the plume distance framework beyond paleoclimatology.
more » « less
Free, publicly-accessible full text available December 12, 2025
A pseudoproxy emulation of the PAGES 2k database using a hierarchy of proxy system models

https://doi.org/10.1038/s41597-023-02489-1

Zhu, Feng; Emile-Geay, Julien; Anchukaitis, Kevin J.; McKay, Nicholas P.; Stevenson, Samantha; Meng, Zilu (December 2023, Scientific Data)

Abstract Paleoclimate reconstructions are now integral to climate assessments, yet the consequences of using different methodologies and proxy data require rigorous benchmarking. Pseudoproxy experiments (PPEs) provide a tractable and transparent test bed for evaluating climate reconstruction methods and their sensitivity to aspects of real-world proxy networks. Here we develop a dataset that leverages proxy system models (PSMs) for this purpose, which emulates the essential physical, chemical, biological, and geological processes that translate climate signals into proxy records, making these synthetic proxies more relevant to the real world. We apply a suite of PSMs to emulate the widely-used PAGES 2k dataset, including realistic spatiotemporal sampling and error structure. A hierarchical approach allows us to produce many variants of this base dataset, isolating the impact of sampling bias in time and space, representation error, sampling error, and other assumptions. Combining these various experiments produces a rich dataset (“pseudoPAGES2k”) for many applications. As an illustration, we show how to conduct a PPE with this dataset based on emerging climate field reconstruction techniques.
more » « less
Full Text Available
cfr (v2024.1.26): a Python package for climate field reconstruction

https://doi.org/10.5194/gmd-17-3409-2024

Zhu, Feng; Emile-Geay, Julien; Hakim, Gregory J; Guillot, Dominique; Khider, Deborah; Tardif, Robert; Perkins, Walter A (April 2024, Geoscientific Model Development)

Abstract. Climate field reconstruction (CFR) refers to the estimation of spatiotemporal climate fields (such as surface temperature) from a collection of pointwise paleoclimate proxy datasets. Such reconstructions can provide rich information on climate dynamics and provide an out-of-sample validation of climate models. However, most CFR workflows are complex and time-consuming, as they involve (i) preprocessing of the proxy records, climate model simulations, and instrumental observations; (ii) application of one or more statistical methods; and (iii) analysis and visualization of the reconstruction results. Historically, this process has lacked transparency and accessibility, limiting reproducibility and experimentation by non-specialists. This article presents an open-source and object-oriented Python package called cfr that aims to make CFR workflows easy to understand and conduct, saving climatologists from technical details and facilitating efficient and reproducible research. cfr provides user-friendly utilities for common CFR tasks such as proxy and climate data analysis and visualization, proxy system modeling, and modularized workflows for multiple reconstruction methods, enabling methodological intercomparisons within the same framework. The package is supported with extensive documentation of the application programming interface (API) and a growing number of tutorial notebooks illustrating its usage. As an example, we present two cfr-driven reconstruction experiments using the PAGES 2k temperature database applying the last millennium reanalysis (LMR) paleoclimate data assimilation (PDA) framework and the graphical expectation–maximization (GraphEM) algorithm, respectively.
more » « less
Full Text Available
PaleoRec: A sequential recommender system for the annotation of paleoclimate datasets

https://doi.org/10.1017/eds.2022.3

Manety, Shravya; Khider, Deborah; Heiser, Christopher; McKay, Nicholas; Emile-Geay, Julien; Routson, Cody (January 2022, Environmental Data Science)

Abstract Studying past climate variability is fundamental to our understanding of current changes. In the era of Big Data, the value of paleoclimate information critically depends on our ability to analyze large volume of data, which itself hinges on standardization. Standardization also ensures that these datasets are more Findable, Accessible, Interoperable, and Reusable. Building upon efforts from the paleoclimate community to standardize the format, terminology, and reporting of paleoclimate data, this article describes PaleoRec, a recommender system for the annotation of such datasets. The goal is to assist scientists in the annotation task by reducing and ranking relevant entries in a drop-down menu. Scientists can either choose the best option for their metadata or enter the appropriate information manually. PaleoRec aims to reduce the time to science while ensuring adherence to community standards. PaleoRec is a type of sequential recommender system based on a recurrent neural network that takes into consideration the short-term interest of a user in a particular dataset. The model was developed using 1996 expert-annotated datasets, resulting in 6,512 sequences. The performance of the algorithm, as measured by the Hit Ratio, varies between 0.7 and 1.0. PaleoRec is currently deployed on a web interface used for the annotation of paleoclimate datasets using emerging community standards.
more » « less
Full Text Available

Search for: All records