

Title: Automated Identification of Characteristic Droplet Size Distributions in Stratocumulus Clouds Utilizing a Data Clustering Algorithm
Abstract

Droplet-level interactions in clouds are often parameterized by a modified gamma fitted to a “global” droplet size distribution. Do “local” droplet size distributions of relevance to microphysical processes look like these average distributions? This paper describes an algorithm to search and classify characteristic size distributions within a cloud. The approach combines hypothesis testing, specifically, the Kolmogorov–Smirnov (KS) test, and a widely used class of machine learning algorithms for identifying clusters of samples with similar properties: density-based spatial clustering of applications with noise (DBSCAN) is used as the specific example for illustration. The two-sample KS test does not presume any specific distribution, is parameter free, and avoids biases from binning. Importantly, the number of clusters is not an input parameter of the DBSCAN-type algorithms but is independently determined in an unsupervised fashion. As implemented, it works on an abstract space from the KS test results, and hence spatial correlation is not required for a cluster. The method is explored using data obtained from the Holographic Detector for Clouds (HOLODEC) deployed during the Aerosol and Cloud Experiments in the Eastern North Atlantic (ACE-ENA) field campaign. The algorithm identifies evidence of the existence of clusters of nearly identical local size distributions. It is found that cloud segments have as few as one and as many as seven characteristic size distributions. To validate the algorithm’s robustness, it is tested on a synthetic dataset and successfully identifies the predefined distributions at plausible noise levels. The algorithm is general and is expected to be useful in other applications, such as remote sensing of cloud and rain properties.
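The combination of the two-sample KS test with density-based clustering described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the pairwise KS statistic between local samples serves directly as the distance fed to a DBSCAN-style algorithm, which is only one possible reading of the "abstract space from the KS test results". The synthetic gamma-distributed diameters, the helper names ks_statistic and dbscan, and the parameter values eps and min_samples are all hypothetical choices made for illustration.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the largest gap always occurs at an observed value
        gap = abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        d = max(d, gap)
    return d


def dbscan(dist, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    # Each point counts as its own neighbor because dist[i][i] == 0.
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1  # point i is a core point and seeds a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is also core, so keep expanding
    return labels


# Hypothetical data: 12 local samples of 500 droplet diameters (micrometers),
# six drawn from each of two different gamma distributions.
rng = random.Random(0)
small_mode = [[rng.gammavariate(4.0, 3.0) for _ in range(500)] for _ in range(6)]
large_mode = [[rng.gammavariate(9.0, 2.5) for _ in range(500)] for _ in range(6)]
samples = small_mode + large_mode

# Pairwise KS statistics form the distance matrix (assumed choice of "space").
dist = [[ks_statistic(a, b) for b in samples] for a in samples]

# eps and min_samples are illustrative; the number of clusters is not an input.
labels = dbscan(dist, eps=0.15, min_samples=3)
n_clusters = len(set(labels) - {-1})
print(labels, n_clusters)
```

Because the distance matrix is built only from KS statistics, samples are grouped by the similarity of their size distributions and not by where in the cloud they were taken, so members of one cluster need not be spatially adjacent.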

Significance Statement

A typical cloud can have billions of drops spread over tens or hundreds of kilometers in space. Keeping track of the sizes, positions, and interactions of all of these droplets is impractical, and, as such, information about the relative abundance of large and small drops is typically quantified with a “size distribution.” Droplets in a cloud interact locally, however, so this work is motivated by the question of whether the cloud droplet size distribution is different in different parts of a cloud. A new method, based on hypothesis testing and machine learning, determines how many different size distributions are contained in a given cloud. This is important because the size distribution describes processes such as cloud droplet growth and light transmission through clouds.

 
Award ID(s):
2001490
NSF-PAR ID:
10371559
Author(s) / Creator(s):
 ;  ;  ;  
Publisher / Repository:
American Meteorological Society
Date Published:
Journal Name:
Artificial Intelligence for the Earth Systems
Volume:
1
Issue:
3
ISSN:
2769-7525
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Water vapor supersaturation in clouds is a random variable that drives activation and growth of cloud droplets. The Pi Convection–Cloud Chamber generates a turbulent cloud with a microphysical steady state that can be varied from clean to polluted by adjusting the aerosol injection rate. The supersaturation distribution and its moments, e.g., mean and variance, are investigated for varying cloud microphysical conditions. High-speed and collocated Eulerian measurements of temperature and water vapor concentration are combined to obtain the temporally resolved supersaturation distribution. This allows quantification of the contributions of variances and covariances between water vapor and temperature. Results are consistent with expectations for a convection chamber, with strong correlation between water vapor and temperature; departures from ideal behavior can be explained as resulting from dry regions on the warm boundary, analogous to entrainment. The saturation ratio distribution is measured under conditions that show monotonic increase of liquid water content and decrease of mean droplet diameter with increasing aerosol injection rate. The change in liquid water content is proportional to the change in water vapor concentration between no-cloud and cloudy conditions. Variability in the supersaturation remains even after cloud droplets are formed, and no significant buffering is observed. Results are interpreted in terms of a cloud microphysical Damköhler number (Da), under conditions corresponding to the slow-microphysics regime. This implies that clouds with very clean regions, such that the slow-microphysics condition is satisfied, will experience supersaturation fluctuations without them being buffered by cloud droplet growth.

    Significance Statement

    The saturation ratio (humidity) in clouds controls the growth rate and formation of cloud droplets. When air in a turbulent cloud mixes, the humidity varies in space and time throughout the cloud. This is important because it means cloud droplets experience different growth histories, thereby resulting in broader size distributions. It is often assumed that growth and evaporation of cloud droplets buffer out some of the humidity variations. Measuring these variations has been difficult, especially in the field. The purpose of this study is to measure the saturation ratio distribution in clouds with a range of conditions. We measure the in-cloud saturation ratio using a convection cloud chamber with clean to polluted cloud properties. We found that in clouds with low droplet concentrations, the variations in the saturation ratio are not suppressed.

     
  2. Abstract

    Precipitation efficiency and optical properties of clouds, both central to determining Earth's weather and climate, depend on the size distribution of cloud particles. In this work theoretical expressions for cloud droplet size distribution shape are evaluated using measurements from controlled experiments in a convective‐cloud chamber. The experiments are a unique opportunity to constrain theory because they are in steady‐state and because the initial and boundary conditions are well characterized compared to typical atmospheric measurements. Three theoretical distributions obtained from a Langevin drift‐diffusion approach to cloud formation via stochastic condensation are tested: (a) stochastic condensation with a constant removal time‐scale; (b) stochastic condensation with a size‐dependent removal time‐scale; (c) droplet growth in a fixed supersaturation condition and with size‐dependent removal. In addition, a similar Weibull distribution that can be obtained from the drift‐diffusion approach, as well as from mechanism‐independent probabilistic arguments (e.g., maximum entropy), is tested as a fourth hypothesis. Statistical techniques such as the χ² test, sum of squared errors of prediction, and residual analysis are employed to judge relative success or failure of the theoretical distributions to describe the experimental data. An extensive set of cloud droplet size distributions is measured under different aerosol injection rates. Five different aerosol injection rates are run for size‐selected aerosol particles, and six are run for broad‐distribution, polydisperse aerosol particles. In relative terms, the expression for stochastic condensation with size‐dependent droplet removal rate compares most favourably with the measurements. However, even this optimal distribution breaks down for broad aerosol size distributions, primarily due to deviations from the measured large‐droplet tail. A possible explanation for the deviation is the Ostwald ripening effect coupled with deactivation/activation in polluted cloud conditions.

     
  3. Abstract

    Data collected with a holographic instrument [Holographic Detector for Clouds (HOLODEC)] on board the High-Performance Instrumented Airborne Platform for Environmental Research Gulfstream-V (HIAPER GV) aircraft from marine stratocumulus clouds during the Cloud System Evolution in the Trades (CSET) field project are examined for spatial uniformity. During one flight leg at 1190 m altitude, 1816 consecutive holograms were taken, which were approximately 40 m apart with individual hologram dimensions of 1.16 cm × 0.68 cm × 12.0 cm and with droplet concentrations of up to 500 cm⁻³. Unlike earlier studies, minimally intrusive data processing (e.g., bypassing calculation of number concentrations, binning, and parametric fitting) is used to test for spatial uniformity of clouds on intra- and interhologram spatial scales (a few centimeters and 40 m, respectively). As a means to test this, measured droplet count fluctuations are normalized with the expected standard deviation from theoretical Poisson distributions, which signifies randomness. Despite the absence of trends in the mean concentration, it is found that the null hypothesis of spatial uniformity on both spatial scales can be rejected with compelling statistical confidence. Monte Carlo simulations suggest that weak clustering explains this signature. These findings also hold for size-resolved analysis but with less certainty. Clustering of droplets caused by, for example, entrainment and turbulence, is size dependent and is likely to influence key processes such as droplet growth and thus cloud lifetime.

     
  4. Abstract

    Recent advances in hail trajectory modeling regularly produce datasets containing millions of hail trajectories. Because hail growth within a storm cannot be entirely separated from the structure of the trajectories producing it, a method to condense the multidimensionality of the trajectory information into a discrete number of features analyzable by humans is necessary. This article presents a three-dimensional trajectory clustering technique that is designed to group trajectories that have similar updraft-relative structures and orientations. The new technique is an application of a two-dimensional method common in the data mining field. Hail trajectories (or “parent” trajectories) are partitioned into segments before they are clustered using a modified version of the density-based spatial clustering of applications with noise (DBSCAN) method. Parent trajectories with segments that are members of at least two common clusters are then grouped into parent trajectory clusters before output. This multistep method has several advantages. Hail trajectories with structural similarities along only portions of their length, e.g., sourced from different locations around the updraft before converging to a common pathway, can still be grouped. However, the physical information inherent in the full length of the trajectory is retained, unlike methods that cluster trajectory segments alone. The conversion of trajectories to an updraft-relative space also allows trajectories separated in time to be clustered. Once the final output trajectory clusters are identified, a method for calculating a representative trajectory for each cluster is proposed. Cluster distributions of hailstone and environmental characteristics at each time step in the representative trajectory can also be calculated.

    Significance Statement

    To understand how a storm produces large hail, we need to understand the paths that hailstones take in a storm when growing. We can simulate these paths using computer models. However, the millions of hailstones in a simulated storm create millions of paths, which are hard to analyze. This article describes a machine learning method that groups together hailstone paths based on how similar their three-dimensional structures look. It will let hail scientists analyze hailstone pathways in storms more easily, and therefore better understand how hail growth happens.

     
  5. Abstract. Microphysical processes are important for the development of clouds and thus Earth's climate. For example, turbulent fluctuations in the water vapor mixing ratio, r, and temperature, T, cause fluctuations in the saturation ratio, S. Because S is the driving factor in the condensational growth of droplets, fluctuations may broaden the cloud droplet size distribution due to individual droplets experiencing different growth rates. The small-scale turbulent fluctuations in the atmosphere that are relevant to cloud droplets are difficult to quantify through field measurements. We investigate these processes in the laboratory using Michigan Tech's Π Chamber. The Π Chamber utilizes Rayleigh–Bénard convection (RBC) to create the turbulent conditions inherent in clouds. In RBC it is common for a large-scale circulation (LSC) to form. As a consequence of the LSC, the temperature field of the chamber is not spatially uniform. In this paper, we characterize the LSC in the Π Chamber and show how it affects the shape of the distributions of r, T, and S. The LSC was found to follow a single roll with an updraft and downdraft along opposing walls of the chamber. Near the updraft (downdraft), the distributions of T and r were positively (negatively) skewed. At each measuring position, S consistently had a negatively skewed distribution, with the downdraft being the most negative. 