Title: Automated Machine Learning to Evaluate the Information Content of Tropospheric Trace Gas Columns for Fine Particle Estimates Over India: A Modeling Testbed
Abstract: India is largely devoid of high‐quality and reliable on‐the‐ground measurements of fine particulate matter (PM2.5). Ground‐level PM2.5 concentrations are estimated from publicly available satellite Aerosol Optical Depth (AOD) products combined with other information. Prior research has largely overlooked the possibility of gaining additional accuracy and insights into the sources of PM using satellite retrievals of tropospheric trace gas columns. We evaluate the information content of tropospheric trace gas columns for PM2.5 estimates over India within a modeling testbed using an Automated Machine Learning (AutoML) approach, which selects from a menu of different machine learning tools based on the data set. We then quantify the relative information content of tropospheric trace gas columns, AOD, meteorological fields, and emissions for estimating PM2.5 over four Indian sub‐regions on daily and monthly time scales. Our findings suggest that, regardless of the specific machine learning model assumptions, incorporating trace gas modeled columns improves PM2.5 estimates. We use the ranking scores produced from the AutoML algorithm and Spearman’s rank correlation to infer or link the possible relative importance of primary versus secondary sources of PM2.5 as a first step toward estimating particle composition. Our comparison of AutoML‐derived models to selected baseline machine learning models demonstrates that AutoML is at least as good as user‐chosen models. The idealized pseudo‐observations (chemical‐transport model simulations) used in this work lay the groundwork for applying satellite retrievals of tropospheric trace gases to estimate fine particle concentrations in India and serve to illustrate the promise of AutoML applications in atmospheric and environmental research.
Award ID(s):
2020677
PAR ID:
10484001
Publisher / Repository:
JAMES
Journal Name:
Journal of Advances in Modeling Earth Systems
Volume:
15
Issue:
3
ISSN:
1942-2466
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Introduction: Traditional methods to estimate exposure to PM2.5 (particulate matter with less than 2.5 µm in diameter) have typically relied on limited regulatory monitors and do not consider human mobility and travel. However, the limited spatial coverage of regulatory monitors and the lack of consideration of mobility limit the ability to capture actual air pollution exposure. Methods: This study aims to improve traditional exposure assessment methods for PM2.5 by incorporating the measurements from a low-cost sensor network (PurpleAir) and regulatory monitors, an automated machine learning modeling framework, and big human mobility data. We develop a monthly-aggregated hourly land use regression (LUR) model based on automated machine learning (AutoML) and assess the model performance across eight metropolitan areas within the US. Results: Our results show that integrating low-cost sensor with regulatory monitor measurements generally improves the AutoML-LUR model accuracy and produces higher spatial variation in PM2.5 concentration maps compared to using regulatory monitor measurements alone. Feature importance analysis shows factors highly correlated with PM2.5 concentrations, including satellite aerosol optical depth, meteorological variables, vegetation, and land use. In addition, we incorporate human mobility data on exposure estimates regarding where people visit to identify spatiotemporal hotspots of places with higher risks of exposure, emphasizing the need to consider both visitor numbers and PM2.5 concentrations when developing exposure reduction strategies. Discussion: This research provides important insights for further public health studies on air pollution by comprehensively assessing the performance of AutoML-LUR models and incorporating human mobility into considering human exposure to air pollution.
  2. Abstract. The climatic and health effects of aerosols are strongly dependent on the intra-annual variations in their loading and properties. While the seasonal variations of regional aerosol optical depth (AOD) have been extensively studied, understanding the temporal variations in aerosol vertical distribution and particle types is also important for an accurate estimate of aerosol climatic effects. In this paper, we combine the observations from four satellite-borne sensors and several ground-based networks to investigate the seasonal variations of aerosol column loading, vertical distribution, and particle types over three populous regions: the Eastern United States (EUS), Western Europe (WEU), and Eastern and Central China (ECC). In all three regions, column AOD, as well as AOD at heights above 800 m, peaks in summer/spring, probably due to accelerated formation of secondary aerosols and hygroscopic growth. In contrast, AOD below 800 m peaks in winter over WEU and ECC regions because more aerosols are confined to lower heights due to the weaker vertical mixing. In the EUS region, AOD below 800 m shows two maximums, one in summer and the other in winter. The temporal trends in low-level AOD are consistent with those in surface fine particle (PM2.5) concentrations. AOD due to fine particles (< 0.7 µm diameter) is much larger in spring/summer than in winter over all three regions. However, the coarse mode AOD (> 1.4 µm diameter) generally shows small variability, except that a peak occurs in spring in the ECC region due to the prevalence of airborne dust during this season. When aerosols are classified according to sources, the dominant type is associated with anthropogenic air pollution, which has a similar seasonal pattern as total AOD. Dust and sea-spray aerosols in the WEU region peak in summer and winter, respectively, but do not show an obvious seasonal pattern in the EUS region. Smoke aerosols, as well as absorbing aerosols, present an obvious unimodal distribution with a maximum occurring in summer over the EUS and WEU regions, whereas they follow a bimodal distribution with peaks in August and March (due to crop residue burning) over the ECC region.
  3. Abstract Ambient fine particulate matter (PM2.5) is the world’s leading environmental health risk factor. Quantification is needed of regional contributions to changes in global PM2.5 exposure. Here we interpret satellite-derived PM2.5 estimates over 1998-2019 and find a reversal of previous growth in global PM2.5 air pollution, which is quantitatively attributed to contributions from 13 regions. Global population-weighted (PW) PM2.5 exposure, related to both pollution levels and population size, increased from 1998 (28.3 μg/m³) to a peak in 2011 (38.9 μg/m³) and decreased steadily afterwards (34.7 μg/m³ in 2019). Post-2011 change was related to exposure reduction in China and slowed exposure growth in other regions (especially South Asia, the Middle East and Africa). The post-2011 exposure reduction contributes to stagnation of growth in global PM2.5-attributable mortality and increasing health benefits per µg/m³ marginal reduction in exposure, implying increasing urgency and benefits of PM2.5 mitigation with aging population and cleaner air.
  4. Abstract Alaskan wildfires have major ecological, social, and economic consequences, but associated health impacts remain unexplored. We estimated cardiorespiratory morbidity associated with wildfire smoke (WFS) fine particulate matter with a diameter less than 2.5 μm (PM2.5) in three major population centers (Anchorage, Fairbanks, and the Matanuska‐Susitna Valley) during the 2015–2019 wildfire seasons. To estimate WFS PM2.5, we utilized data from ground‐based monitors and satellite‐based smoke plume estimates. We implemented time‐stratified case‐crossover analyses with single and distributed lag models to estimate the effect of WFS PM2.5 on cardiorespiratory emergency department (ED) visits. On the day of exposure to WFS PM2.5, there were increased odds of asthma‐related ED visits among 15–65 year olds (OR = 1.12, 95% CI = 1.08, 1.16), people >65 years (OR = 1.15, 95% CI = 1.01, 1.31), Alaska Native people (OR = 1.16, 95% CI = 1.09, 1.23), and in Anchorage (OR = 1.10, 95% CI = 1.05, 1.15) and Fairbanks (OR = 1.12, 95% CI = 1.07, 1.17). There was an increased risk of heart failure‐related ED visits for Alaska Native people (Lag Day 5 OR = 1.13, 95% CI = 1.02, 1.25). We found evidence that rural populations may delay seeking care. As the frequency and magnitude of Alaskan wildfires continue to increase due to climate change, understanding the health impacts will be imperative. A nuanced understanding of the effects of WFS on specific demographic and geographic groups facilitates data‐driven public health interventions and fire management protocols that address these adverse health effects.
  5. Abstract Most fine ambient particulate matter (PM2.5)-based epidemiological models use globalized concentration-response (CR) functions assuming that the toxicity of PM2.5 is solely mass-dependent without considering its chemical composition. Although oxidative potential (OP) has emerged as an alternate metric of PM2.5 toxicity, the association between PM2.5 mass and OP on a large spatial extent has not been investigated. In this study, we evaluate this relationship using 385 PM2.5 samples collected from 14 different sites across 4 different continents and using 5 different OP (and cytotoxicity) endpoints. Our results show that the relationship between PM2.5 mass vs. OP (and cytotoxicity) is largely non-linear due to significant differences in the intrinsic toxicity, resulting from a spatially heterogeneous chemical composition of PM2.5. These results emphasize the need to develop localized CR functions incorporating other measures of PM2.5 properties (e.g., OP) to better predict the PM2.5-attributed health burdens.