Machine Learning based investigation of the variables affecting summertime lightning frequency over the Southern Great Plains

Shan, Siyu; Allen, Dale; Li, Zhanqing; Pickering, Kenneth; Lapierre, Jeff

doi:10.5194/egusphere-2023-1020

Abstract. Lightning is affected by many factors, many of which are not routinely measured, well understood, or accounted for in physical models. Machine learning (ML) excels at exploring and revealing complex relationships between meteorological variables such as those measured at the Southern Great Plains (SGP) Atmospheric Radiation Measurement (ARM) site, which provides an unprecedented level of detail on atmospheric conditions and clouds. Several commonly used ML models have been applied to analyse the relationship between ARM data and lightning data from the Earth Networks Total Lightning Network (ENTLN) in order to identify important variables affecting lightning occurrence in the vicinity of the SGP site during the summers (June, July, August and September) of 2012 to 2020. Testing various ML models, we found that the Random Forest model is the best predictor among common classifiers. It predicted lightning occurrence with an accuracy of 76.9 % and an area under the curve (AUC) of 0.850. Using this model, we further ranked the variables in terms of their effectiveness in predicting lightning and identified geometric cloud thickness, rain rate and convective available potential energy (CAPE) as the most effective predictors. The contrast in meteorological variables between no-lightning and frequent-lightning periods was examined for hours with CAPE values conducive to thunderstorm formation. Besides the variables considered for the ML models, surface variables such as equivalent potential temperature and mid-altitude variables such as minimum equivalent potential temperature show a large contrast between no-lightning and frequent-lightning hours. Finally, a notable positive relationship between intra-cloud (IC) flash fraction and the square root of CAPE was found, suggesting that stronger updrafts increase the height of the electrification zone, resulting in fewer flashes reaching the surface and consequently a greater IC flash fraction.
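
The two skill scores quoted in the abstract, accuracy and AUC, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the eight hourly labels and scores are all hypothetical and chosen only to show how the two metrics are defined for a binary lightning/no-lightning classifier.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of hours whose thresholded score matches the observed label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen lightning hour receives a higher score than a randomly chosen
    no-lightning hour (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = lightning observed in that hour, 0 = none;
# scores are made-up classifier probabilities, not values from the study.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.3, 0.6, 0.1, 0.8, 0.5]

print(f"accuracy = {accuracy(labels, scores):.3f}")  # accuracy = 0.625
print(f"AUC      = {auc(labels, scores):.3f}")       # AUC      = 0.875
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking skill across all thresholds, which is one reason the two are commonly reported together.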
