Title: The ARM Data-Oriented Metrics and Diagnostics Package for Climate Models: A New Tool for Evaluating Climate Models with Field Data
Abstract The U.S. Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program User Facility produces unique, long-term, continuous ground-based measurements of atmospheric state, precipitation, turbulent fluxes, radiation, aerosol, cloud, and the land surface, collected at multiple sites. These comprehensive datasets have been widely used to calibrate climate models and have proven invaluable for climate model development and improvement. This article introduces an evaluation package to facilitate the use of ground-based ARM measurements in climate model evaluation. The ARM data-oriented metrics and diagnostics package (ARM-DIAGS) includes both ARM observational datasets and a Python-based analysis toolkit for computation and visualization. The observational datasets are compiled from multiple ARM data products and specifically tailored for use in climate model evaluation. ARM-DIAGS also includes simulation data from models participating in the Coupled Model Intercomparison Project (CMIP), which will allow climate-modeling groups to compare a new, candidate version of their model to existing CMIP models. The analysis toolkit is designed to make the metrics and diagnostics quickly available to model developers.
Journal Name:
Bulletin of the American Meteorological Society
Page Range or eLocation-ID:
E1619 to E1627
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract A set of diagnostics based on simple, statistical relationships between precipitation and the thermodynamic environment in observations is implemented to assess phase 6 of the Coupled Model Intercomparison Project (CMIP6) model behavior with respect to precipitation. Observational data from the Atmospheric Radiation Measurement (ARM) permanent field observational sites are augmented with satellite observations of precipitation and temperature as an observational baseline. A robust relationship across observational datasets between column water vapor (CWV) and precipitation, in which conditionally averaged precipitation exhibits a sharp pickup at some critical CWV value, provides a useful convective onset diagnostic for climate model comparison. While a few models reproduce an appropriate precipitation pickup, most models begin their pickup at too low CWV and the increase in precipitation with increasing CWV is too weak. Convective transition statistics compiled in column relative humidity (CRH) partially compensate for model temperature biases, although imperfectly, since the temperature dependence is more complex than that of column saturation. Significant errors remain in individual models and weak pickups are generally not improved. The conditional-average precipitation as a function of CRH can be decomposed into the product of the probability of raining and mean precipitation during raining times (conditional intensity). The pickup behavior is primarily dependent on the probability of raining near the transition and on the conditional intensity at higher CRH. Most models roughly capture the CRH dependence of these two factors. However, compensating biases often occur: model conditional intensity that is too low at a given CRH is compensated in part by excessive probability of precipitation.
  2. Abstract. A comparison of polar stratospheric cloud (PSC) occurrence from 2006 to 2010 is presented, as observed from the ground-based lidar station at McMurdo (Antarctica) and by the satellite-borne CALIOP lidar (Cloud-Aerosol Lidar with Orthogonal Polarization) measuring over McMurdo. McMurdo (Antarctica) is one of the primary lidar stations for aerosol measurements of the NDACC (Network for Detection of Atmospheric Climate Change). The ground-based observations have been classified with an algorithm derived from the recent v2 detection and classification scheme, used to classify PSCs observed by CALIOP.

    A statistical approach has been used to compare ground-based and satellite-based observations, since point-to-point comparison is often troublesome due to the intrinsic differences in the observation geometries and the imperfect overlap of the observed areas.

    A comparison of space-borne lidar observations and a selection of simulations obtained from chemistry–climate models (CCMs) has been made by using a series of quantitative diagnostics based on the statistical occurrence of different PSC types. The distribution of PSCs over Antarctica, calculated by several CCMVal-2 and CCMI chemistry–climate models, has been compared with the PSC coverage observed by the satellite-borne CALIOP lidar. The use of several diagnostic tools, including the temperature dependence of the PSC occurrences, reveals the merits and flaws of the different models. The diagnostic methods have been defined to overcome (at least partially) the possible differences due to the resolution of the models and to identify differences due to microphysics (e.g., the dependence of PSC occurrence on T − T_NAT).

    A significant temperature bias of most models has been observed, as well as a limited ability to reproduce the longitudinal variations in PSC occurrences observed by CALIOP. In particular, a strong temperature bias has been observed in CCMVal-2 models with a strong impact on PSC formation. The WACCM-CCMI (Whole Atmosphere Community Climate Model – Chemistry-Climate Model Initiative) model compares rather well with the CALIOP observations, although a temperature bias is still present.
  3. Abstract. Because of the Southern Ocean's remote location and extreme weather conditions, atmospheric in situ measurements there are rare. As a result, aerosol–cloud interactions in this region are poorly understood and remain a major source of uncertainty in climate models. This, in turn, contributes substantially to persistent biases in climate model simulations such as the well-known positive shortwave radiation bias at the surface, as well as biases in numerical weather prediction models and reanalyses. It has been shown in previous studies that in situ and ground-based remote sensing measurements across the Southern Ocean are critical for complementing satellite data sets due to the importance of boundary layer and low-level cloud processes. These processes are poorly sampled by satellite-based measurements and are often obscured by multiple overlying cloud layers. Satellite measurements also do not constrain the aerosol–cloud processes very well, given their imprecise estimation of cloud condensation nuclei. In this work, we present a comprehensive set of ship-based aerosol and meteorological observations collected on the 6-week Southern Ocean Ross Sea Marine Ecosystem and Environment voyage (TAN1802) of RV Tangaroa across the Southern Ocean, from Wellington, New Zealand, to the Ross Sea, Antarctica. The voyage was carried out from 8 February to 21 March 2018. Many distinct, but contemporaneous, data sets were collected throughout the voyage. The compiled data sets include measurements from a range of instruments, such as (i) meteorological conditions at the sea surface and profile measurements; (ii) the size and concentration of particles; (iii) trace gases dissolved in the ocean surface such as dimethyl sulfide and carbonyl sulfide; and (iv) remotely sensed observations of low clouds. Here, we describe the voyage, the instruments, and data processing, and provide a brief overview of some of the data products available. We encourage the scientific community to use these measurements for further analysis and model evaluation studies, in particular, for studies of Southern Ocean clouds, aerosol, and their interaction. The data sets presented in this study are publicly available (Kremser et al., 2020).
  4. Abstract

    Precipitation sustains life and supports human activities, making its prediction one of the most societally relevant challenges in weather and climate modeling. Limitations in modeling precipitation underscore the need for diagnostics and metrics to evaluate precipitation in simulations and predictions. While routine use of basic metrics is important for documenting model skill, more sophisticated diagnostics and metrics aimed at connecting model biases to their sources and revealing precipitation characteristics relevant to how model precipitation is used are critical for improving models and their uses. This paper illustrates examples of exploratory diagnostics and metrics including 1) spatiotemporal characteristics metrics such as diurnal variability, probability of extremes, duration of dry spells, spectral characteristics, and spatiotemporal coherence of precipitation; 2) process-oriented metrics based on the rainfall–moisture coupling and temperature–water vapor environments of precipitation; and 3) phenomena-based metrics focusing on precipitation associated with weather phenomena including low pressure systems, mesoscale convective systems, frontal systems, and atmospheric rivers. Together, these diagnostics and metrics delineate the multifaceted and multiscale nature of precipitation, its relations with the environments, and its generation mechanisms. The metrics are applied to historical simulations from phases 5 and 6 of the Coupled Model Intercomparison Project. Models exhibit diverse skill as measured by the suite of metrics, with very few models consistently ranked as top or bottom performers compared to other models in multiple metrics. Analysis of model skill across metrics and models suggests possible relationships among subsets of metrics, motivating the need for more systematic analysis to understand model biases for informing model development.
  5. Abstract. We present in this technical note the research protocol for phase 4 of the Air Quality Model Evaluation International Initiative (AQMEII4). This research initiative is divided into two activities, collectively having three goals: (i) to define the current state of the science with respect to representations of wet and especially dry deposition in regional models, (ii) to quantify the extent to which different dry deposition parameterizations influence retrospective air pollutant concentration and flux predictions, and (iii) to identify, through the use of a common set of detailed diagnostics, sensitivity simulations, model evaluation, and reduction of input uncertainty, the specific causes for the current range of these predictions. Activity 1 is dedicated to the diagnostic evaluation of wet and dry deposition processes in regional air quality models (described in this paper), and Activity 2 to the evaluation of dry deposition point models against ozone flux measurements at multiple towers with multiyear observations (to be described in future submissions as part of the special issue on AQMEII4). The scope of this paper is to present the scientific protocols for Activity 1, as well as to summarize the technical information associated with the different dry deposition approaches used by the participating research groups of AQMEII4. In addition to describing all common aspects and data used for this multi-model evaluation activity, most importantly, we present the strategy devised to allow a common process-level comparison of dry deposition obtained from models using sometimes very different dry deposition schemes. The strategy is based on adding detailed diagnostics to the algorithms used in the dry deposition modules of existing regional air quality models, in particular archiving diagnostics specific to land use–land cover (LULC) and creating standardized LULC categories to facilitate cross-comparison of LULC-specific dry deposition parameters and processes, as well as archiving effective conductance and effective flux as means for comparing the relative influence of different pathways towards the net or total dry deposition. This new approach, along with an analysis of precipitation and wet deposition fields, will provide an unprecedented process-oriented comparison of deposition in regional air quality models. Examples of how specific dry deposition schemes used in participating models have been reduced to the common set of comparable diagnostics defined for AQMEII4 are also presented.
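
Item 1 in the list above describes decomposing conditional-average precipitation, as a function of column relative humidity (CRH), into the product of the probability of raining and the conditional intensity. The Python sketch below illustrates that arithmetic only; it is not the authors' code and is not part of ARM-DIAGS. The function name, the number of bins, the rain threshold, and the sample values are all hypothetical, and the product identity is exact only under the stated assumption that non-raining samples contribute zero precipitation.

```python
from collections import defaultdict


def conditional_precip_stats(crh, precip, n_bins=10, rain_threshold=0.0):
    """Bin samples by CRH (assumed to lie in [0, 1]) and return, per bin,
    (mean precipitation, probability of raining, conditional intensity).

    Hypothetical helper for illustration; a sample counts as "raining"
    when its precipitation exceeds rain_threshold.
    """
    bins = defaultdict(list)
    for x, p in zip(crh, precip):
        # Clamp so that CRH == 1.0 falls in the last bin.
        index = min(int(x * n_bins), n_bins - 1)
        bins[index].append(p)

    stats = {}
    for index, values in sorted(bins.items()):
        raining = [p for p in values if p > rain_threshold]
        mean_precip = sum(values) / len(values)
        prob_rain = len(raining) / len(values)
        intensity = sum(raining) / len(raining) if raining else 0.0
        stats[index] = (mean_precip, prob_rain, intensity)
    return stats


# Made-up sample values, chosen only to exercise the function.
crh = [0.55, 0.52, 0.58, 0.85, 0.82, 0.88, 0.86]
precip = [0.0, 0.0, 1.0, 2.0, 0.0, 4.0, 6.0]
stats = conditional_precip_stats(crh, precip)

for index, (mean_precip, prob_rain, intensity) in stats.items():
    # With zero precipitation in non-raining samples, the bin mean equals
    # probability of raining times conditional intensity.
    print(index, mean_precip, prob_rain * intensity)
```

Separating the two factors in this way is what lets a compensating bias show up: a bin mean can look reasonable while the probability of raining is too high and the conditional intensity too low.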