Title: PM2.5 Modeling and Historical Reconstruction over the Continental USA Utilizing GOES-16 AOD
In this study, we present a nationwide machine learning model for hourly PM2.5 estimation over the continental United States (US) using high temporal resolution Geostationary Operational Environmental Satellites (GOES-16) Aerosol Optical Depth (AOD) data, meteorological variables from the European Centre for Medium-Range Weather Forecasts (ECMWF), and ancillary data collected between May 2017 and December 2020. A sensitivity analysis was conducted on the predictor variables to determine the optimal model. The analysis shows that GOES-16 AOD, ECMWF variables, and ancillary data are effective predictors for PM2.5 estimation and historical reconstruction; the resulting model achieves an average mean absolute error (MAE) of 3.0 μg/m3 and a root mean square error (RMSE) of 5.8 μg/m3. This study also found that both the model performance and the site-measured PM2.5 concentrations show strong spatial and temporal patterns. On the temporal scale, the model performed best between 8:00 p.m. and 11:00 p.m. (UTC), with the highest coefficient of determination (R2) in autumn and the lowest MAE and RMSE in spring. On the spatial scale, analysis based on the ancillary data shows that R2 scores correlate positively with the mean measured PM2.5 concentration at monitoring sites. Mean measured PM2.5 concentrations are positively correlated with population density and negatively correlated with elevation. Water, forests, and wetlands are associated with low PM2.5 concentrations, whereas developed land, cultivated crops, shrubs, and grass are associated with high PM2.5 concentrations. In addition, the reconstructed PM2.5 surfaces serve as an important data source for pollution event tracking and PM2.5 analysis. For this purpose, hourly PM2.5 estimates were produced at a resolution of 10 km by 10 km from May 2017 to December 2020, and the estimates for August through November 2020, the period of the California Santa Clara Unit (SCU) Lightning Complex fires, are presented. Based on the quantitative and visualization results, this study reveals that a number of large wildfires in California had a profound impact on the values and spatiotemporal distributions of PM2.5 concentrations.
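The three scores quoted in the abstract (MAE, RMSE, and R2) can be illustrated with a minimal sketch. It assumes the standard textbook definitions; the record does not state exactly how the study computed R2, and the sample values below are hypothetical, not data from the paper.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: the average of |observed - predicted|."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: the square root of the mean squared difference."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly PM2.5 values in μg/m3 (illustrative only).
observed = [8.0, 12.0, 10.0, 30.0]
predicted = [9.0, 11.0, 12.0, 24.0]

print(mae(observed, predicted))
print(rmse(observed, predicted))
print(r2(observed, predicted))
```

Because RMSE squares each error before averaging, a single large miss (such as the 6 μg/m3 error on the last sample) raises it above the MAE. RMSE is never smaller than MAE, which is consistent with the reported RMSE of 5.8 μg/m3 exceeding the MAE of 3.0 μg/m3.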
Award ID(s):
2019135
PAR ID:
10347633
Journal Name:
Remote Sensing
Volume:
13
Issue:
23
ISSN:
2072-4292
Page Range / eLocation ID:
4788
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Short-term exposure to fine particulate matter (PM2.5) pollution is linked to numerous adverse health effects. Pollution episodes, such as wildfires, can lead to substantial increases in PM2.5 levels. However, sparse regulatory measurements provide an incomplete understanding of pollution gradients. Here, we demonstrate an infrastructure that integrates community-based measurements from a network of low-cost PM2.5 sensors with rigorous calibration and a Gaussian process model to understand neighborhood-scale PM2.5 concentrations during three pollution episodes (July 4, 2018, fireworks; July 5 and 6, 2018, wildfire; Jan 3−7, 2019, persistent cold air pool, PCAP). The firework/wildfire events included 118 sensors in 84 locations, while the PCAP event included 218 sensors in 138 locations. The model results accurately predict reference measurements during the fireworks (n: 16; hourly root-mean-square error, RMSE: 12.3−21.5 μg/m3; normalized RMSE, nRMSE: 9−24%), the wildfire (n: 46; RMSE: 2.6−4.0 μg/m3; nRMSE: 13.1−22.9%), and the PCAP (n: 96; RMSE: 4.9−5.7 μg/m3; nRMSE: 20.2−21.3%). They also revealed dramatic geospatial differences in PM2.5 concentrations that are not apparent when only considering government measurements or viewing the US Environmental Protection Agency’s AirNow visualizations. Complementing the PM2.5 estimates and visualizations are highly resolved uncertainty maps. Together, these results illustrate the potential for low-cost sensor networks that, combined with a data-fusion algorithm and appropriate calibration and training, can estimate PM2.5 concentrations dynamically and with improved accuracy during pollution episodes. These highly resolved uncertainty estimates can provide a much-needed strategy to communicate uncertainty to end users.
  2. Water quality is affected by multiple spatial and temporal factors, including the surrounding land characteristics, human activities, and antecedent precipitation amounts. However, the use of machine learning techniques to identify relationships between water quality and spatially and temporally varying environmental variables in a heterogeneous urban landscape remains understudied. We explore how seasonal and variable precipitation amounts and other small-scale landscape variables affect E. coli, total suspended solids (TSS), nitrogen-nitrate, orthophosphate, lead, and zinc concentrations in Portland, Oregon, USA. Mann–Whitney tests were used to detect differences in water quality between seasons and COVID-19 periods. Spearman’s rank correlation analysis was used to identify the relationship between water quality and explanatory variables. A Random Forest (RF) model was used to predict water quality using antecedent precipitation amounts and landscape variables as inputs. The performance of RF was compared with that of ordinary least squares (OLS). Mann–Whitney tests identified statistically significant differences in all pollutant concentrations (except TSS) between the wet and dry seasons. Nitrate was the only pollutant to display statistically significant reductions in median concentrations (from 1.5 mg/L to 1.04 mg/L) during the COVID-19 lockdown period, likely associated with reduced traffic volumes. Spearman’s correlation analysis identified the highest correlation coefficients between one-day precipitation amounts and E. coli, lead, zinc, and TSS concentrations. Road length is positively associated with E. coli and zinc. The RF model best predicts orthophosphate concentrations (R2 = 0.58), followed by TSS (R2 = 0.54) and nitrate (R2 = 0.46). E. coli was the most difficult to model and had the highest RMSE, MAE, and MAPE values. Overall, the RF model outperformed OLS, as evaluated by RMSE, MAE, MAPE, and R2. The RF model was an effective approach to predicting pollutant concentrations in urban streams from categorical seasonal and COVID-19 data together with continuous rain and landscape variables. Implementing optimization techniques can further improve the model’s performance and allow researchers to use a machine learning approach for water quality modeling.
  3. In urban areas like Chicago, daily life extends above ground level due to the prevalence of high-rise buildings where residents and commuters live and work. This study examines the variation in fine particulate matter (PM2.5) concentrations across building stories. PM2.5 levels were measured using PurpleAir sensors, installed between 8 April and 7 May 2023, on floors one, four, six, and nine of an office building in Chicago. Additionally, data were collected from a public outdoor PurpleAir sensor on the fourteenth floor of a condominium located 800 m away. The results show that outdoor PM2.5 concentrations peak at 14 m height, and then decline by 0.11 μg/m3 per meter elevation, especially noticeable from midnight to 8 a.m. under stable atmospheric conditions. Indoor PM2.5 concentrations increase steadily by 0.02 μg/m3 per meter elevation, particularly during peak work hours, likely caused by greater infiltration rates at higher floors. Both outdoor and indoor concentrations peak around noon. We find that indoor and outdoor PM2.5 are positively correlated, with indoor levels consistently remaining lower than outside levels. These findings align with previous research suggesting decreasing outdoor air pollution concentrations with increasing height. The study informs decision-making by community members and policymakers regarding air pollution exposure in urban settings. 
  4. Breathing in fine particulate matter of diameter less than 2.5 µm (PM2.5) greatly increases an individual’s risk of cardiovascular and respiratory diseases. As climate change progresses, extreme weather events, including wildfires, are expected to increase, exacerbating air pollution. However, models often struggle to capture extreme pollution events due to the rarity of high PM2.5 levels in training datasets. To address this, we implemented cluster-based undersampling and trained Transformer models to improve extreme event prediction using various cutoff thresholds (12.1 µg/m3 and 35.5 µg/m3) and partial sampling ratios (10/90, 20/80, 30/70, 40/60, 50/50). Our results demonstrate that the 35.5 µg/m3 threshold, paired with a 20/80 partial sampling ratio, achieved the best performance, with an RMSE of 2.080, MAE of 1.386, and R2 of 0.914, particularly excelling in forecasting high PM2.5 events. Overall, models trained on augmented data significantly outperformed those trained on original data, highlighting the importance of resampling techniques in improving air quality forecasting accuracy, especially for high-pollution scenarios. These findings provide critical insights into optimizing air quality forecasting models, enabling more reliable predictions of extreme pollution events. By advancing the ability to forecast high PM2.5 levels, this study contributes to the development of more informed public health and environmental policies to mitigate the impacts of air pollution, and advances the technology for building better air quality digital twins.
  5. Background: Wearable technology is used by clinicians and researchers and plays a critical role in biomechanical assessments and rehabilitation. Objective: The purpose of this research is to validate a soft robotic stretch (SRS) sensor embedded in a compression knee brace (smart knee brace) against a motion capture system, focusing on knee joint kinematics. Methods: Sixteen participants donned the smart knee brace and completed three separate tasks: non-weight bearing knee flexion/extension, bodyweight air squats, and gait trials. Adjusted R2 for goodness of fit (R2), root mean square error (RMSE), and mean absolute error (MAE) between the SRS sensor and motion capture kinematic data were assessed for all three tasks. Results: For knee flexion/extension, R2 = 0.799, RMSE = 5.470, and MAE = 4.560 were observed; for bodyweight air squats, R2 = 0.957, RMSE = 8.127, and MAE = 6.870; and for gait trials, R2 = 0.565, RMSE = 9.190, and MAE = 7.530. Conclusions: The smart knee brace demonstrated a higher goodness of fit and accuracy during weight-bearing air squats, followed by non-weight bearing knee flexion/extension, and a lower goodness of fit and accuracy during gait, which can be attributed to the SRS sensor position and orientation rather than the range of motion achieved in each task.
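
The water-quality record in the list above relies on Spearman’s rank correlation to relate precipitation to pollutant concentrations. The following is a minimal pure-Python sketch of that statistic, assuming the common definition (the Pearson correlation of average ranks, with ties sharing their mean rank). The function names and sample values are hypothetical and are not taken from that study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical one-day precipitation (mm) and TSS (mg/L) samples.
precipitation = [0.0, 2.0, 5.0, 9.0, 20.0]
tss = [3.0, 4.0, 10.0, 50.0, 400.0]
print(spearman(precipitation, tss))
```

Because only ranks enter the calculation, a monotonic but strongly nonlinear relationship like the one in the sample data still yields a coefficient of essentially 1. This is one reason rank correlation is a common choice for skewed concentration data.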