This content will become publicly available on December 31, 2025

Title: Using Machine Learning in Analyzing Air Quality Discrepancies of Environmental Impact
In this study, we apply machine learning and software engineering to analyze air pollution levels in the City of Baltimore. The data model was fed with three primary data sources: 1) a biased method of estimating insurance risk used by the Home Owners' Loan Corporation, 2) demographics of Baltimore residents, and 3) census data estimates of NO2 and PM2.5 concentrations. The dataset covers 650,643 Baltimore residents out of 44.7 million residents in 202 major U.S. cities. The results show that air pollution levels have a clear association with the biased insurance estimating method. Large disparities in NO2 levels are present between more desirable and low-income blocks. Similar disparities in air pollution levels exist across residents' ethnicities. As Baltimore's population consists of a greater proportion of people of color, the finding reveals how decades-old policies have continued to discriminate against Baltimore citizens and affect their quality of life today. A QML-based feature mapping is applied to a small dataset.
Award ID(s):
2329053
PAR ID:
10634670
Author(s) / Creator(s):
Publisher / Repository:
Proc. IEEE AI and DKE 2024 https://arxiv.org/abs/2506.17319
Date Published:
Format(s):
Medium: X
Institution:
https://arxiv.org/abs/2506.17319
Sponsoring Org:
National Science Foundation
More Like this
  1. Belmont County, Ohio is heavily dominated by unconventional oil and gas development that results in high levels of ambient air pollution. Residents here chose to work with a national volunteer network to develop a method of participatory science to answer questions about the association between the health of their community and pollution exposure from the many industrial point sources in the county and the surrounding area and river valley. After first directing their questions to the government agencies responsible for permitting and protecting public health, residents noted the lack of detailed data and understanding of the impact of these industries. These residents and environmental advocates are using the resulting science to open a dialogue with the EPA in the hope of ultimately developing, in collaboration, air quality standards that better protect public health. Results from comparing measurements from a citizen-led, participatory, low-cost, high-density air pollution sensor network of 35 particulate matter and 25 volatile organic compound sensors against regulatory monitors show low correlations (consistently R² < 0.55). This network analysis, combined with complementary models of emission plumes, is revealing the inadequacy of the sparse regulatory air pollution monitoring network in the area and opening many avenues for public health officials to further verify people's experiences and act in the interest of residents' health with enforcement and informed permitting practices. Further, the collaborative best practices developed by this study serve as a launchpad for other community science efforts looking to monitor local air quality in response to industrial growth.
  2. We present a novel source attribution approach that incorporates satellite data into GEOS-Chem adjoint simulations to characterize the species-specific, regional, and sectoral contributions of daily emissions for 3 air pollutants: fine particulate matter (PM2.5), ozone (O3), and nitrogen dioxide (NO2). This approach is implemented for Washington, DC, first for 2011, to identify urban pollution sources, and again for 2016, to examine the pollution response to changes in anthropogenic emissions. In 2011, anthropogenic emissions contributed an estimated 263 (uncertainty: 130–444) PM2.5- and O3-attributable premature deaths and 1,120 (391–1,795) NO2-attributable new pediatric asthma cases in DC. PM2.5 exposure was responsible for 90% of these premature deaths. On-road vehicle emissions contributed 51% of NO2-attributable new asthma cases and 23% of pollution-attributable premature deaths, making this sector the largest individual contributor to DC's air pollution–related health burden. Regional emissions, originating from Maryland, Virginia, and Pennsylvania, were the most responsible for pollution-related health impacts in DC, contributing 57% of premature death impacts and 89% of asthma cases. Emissions from distant states contributed 34% more to PM2.5 exposure in the wintertime than in the summertime, occurring in parallel with strong wintertime westerlies and a reduced photochemical sink. Emission reductions between 2011 and 2016 resulted in health benefits of 76 (28–149) fewer pollution-attributable premature deaths and 227 (2–617) fewer NO2-attributable pediatric asthma cases. The largest sectors contributing to decreases in pollution-related premature deaths were energy generation units (26%) and on-road vehicles (20%). Decreases in NO2-attributable pediatric asthma cases were mostly due to emission reductions from on-road vehicles (63%). Emission reductions from energy generation units were found to impact PM2.5 more than O3, while on-road vehicle emission reductions impacted O3 proportionally more than PM2.5. This novel method is capable of capturing the sources of urban pollution at fine spatial and temporal scales and is applicable to many urban environments globally.
  3. Environmental stresses born of population growth, consumerism and industrialization have subjected many populations worldwide to elevated air pollution. Philadelphia, a historically industrial city in the Northeastern United States, is ranked in the top 25 cities in the country for harmful air pollutants (PM2.5, ozone). Philadelphia also experiences great financial stratification and environmental racism, which often unfairly asserts the pains of environmental pollution & associated health effects on socioeconomically disadvantaged communities. This study seeks to succinctly quantify which populations may be at risk for health effects associated with air pollution (specifically asthma, chronic obstructive pulmonary disease) through a suite of census-derived attributes. Using ArcMap Geographical Information System software (ESRI), attributes, categorized as promoting vulnerability or adaptability, are combined with air pollution data collected in summer 2019 to form a non-weighted 'Social Vulnerability Index' (SVI) at a census-tract level for Philadelphia. SVI demonstrated several clusters of neighborhoods with great disparities in socioeconomic factors. The census tracts with higher SVI tended to have higher levels of asthma and COPD (and vice versa). With improvements and acknowledgement of Philadelphia's uniqueness, SVI of this kind may be used to inform policymakers on city planning (e.g. placement of future highways, industrial centers, etc.) to alleviate compounded respiratory/pulmonary-related stresses on disadvantaged communities. Future analysis including green space coverage, other forms of air pollution, and/or a quantification of social connectivity may help to improve further understanding of the intersection between socioeconomic factors, air pollution, and health in Philadelphia, PA.
  4. A substantial reduction in global transport and industrial processes stemming from the novel SARS-CoV-2 coronavirus and subsequent pandemic resulted in sharp declines in emissions, including for NO2. This has implications for human health, given the role that this gas plays in pulmonary disease and the findings that past exposure to air pollutants has been linked to the most adverse outcomes from COVID-19 disease, likely via various co-morbidities. To explore how much COVID-19 shutdown policies impacted urban air quality, we examined ground-based NO2 sensor data from 11 U.S. cities from a two-month window (March–April) during shutdown in 2020, controlling for natural seasonal variability by using average changes in NO2 over the previous five years for these cities. Reductions in NO2 levels and in vehicle miles travelled (VMT) in March and April compared to January 2020 ranged between 11–65% and 11–89%, respectively, consistent with a sharp drop in vehicular traffic from shutdown-related travel restrictions. To explore this link closely, we gathered detailed traffic count data in one city, Indianapolis, Indiana, and found a strong correlation (0.90) between traffic counts/classification and VMT, a moderate correlation (0.54) between NO2 and traffic-related data, and an average reduction of 1.11 ppb of NO2 linked to vehicular data. This finding indicates that targeted reductions in pollutants like NO2 can be made by manipulating traffic patterns, thus potentially leading to more population-level health resilience in the future.
  5. Smoke from wildfires is a significant source of air pollution, which can adversely impact air quality and ecosystems downwind. With the recently increasing intensity and severity of wildfires, the threat to air quality is expected to increase. Satellite-derived biomass burning emissions can fill in gaps in the absence of aircraft or ground-based measurement campaigns and can help improve the online calculation of biomass burning emissions as well as the biomass burning emissions inventories that feed air quality models. This study focuses on satellite-derived NOx emissions using the high-spatial-resolution TROPOspheric Monitoring Instrument (TROPOMI) NO2 dataset. Advancements and improvements to the satellite-based determination of forest fire NOx emissions are discussed, including information on plume height and effects of aerosol scattering and absorption on the satellite-retrieved vertical column densities. Two common top-down emission estimation methods, (1) an exponentially modified Gaussian (EMG) and (2) a flux method, are applied to synthetic data to determine the accuracy and the sensitivity to different parameters, including wind fields, satellite sampling, noise, lifetime, and plume spread. These tests show that emissions can be accurately estimated from single TROPOMI overpasses. The effect of smoke aerosols on TROPOMI NO2 columns (via air mass factors, AMFs) is estimated, and these satellite columns and emission estimates are compared to aircraft observations from four different aircraft campaigns measuring biomass burning plumes in 2018 and 2019 in North America. Our results indicate that applying an explicit aerosol correction to the TROPOMI NO2 columns improves the agreement with the aircraft observations (by about 10 %–25 %). The aircraft- and satellite-derived emissions are in good agreement within the uncertainties. Both top-down emissions methods work well; however, the EMG method seems to output more consistent results and has better agreement with the aircraft-derived emissions. Assuming a Gaussian plume shape for various biomass burning plumes, we estimate an average NOx e-folding time of 2 ± 1 h from TROPOMI observations. Based on chemistry transport model simulations and aircraft observations, the net emissions of NOx are 1.3 to 1.5 times greater than the satellite-derived NO2 emissions. A correction factor of 1.3 to 1.5 should thus be used to infer net NOx emissions from the satellite retrievals of NO2.
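As a rough illustration of the two numbers quoted in the last abstract in the list above (a correction factor of 1.3 to 1.5 from satellite-derived NO2 emissions to net NOx emissions, and an average NOx e-folding time of about 2 h), the arithmetic might be sketched as follows. This is a minimal sketch, not the authors' method: the function names, the assumption of simple first-order exponential decay, and the sample input of 10.0 are hypothetical and chosen only for illustration.

```python
import math

# Values quoted in the abstract; everything else in this sketch is assumed.
NOX_TO_NO2_FACTOR_RANGE = (1.3, 1.5)  # net NOx / satellite-derived NO2 emissions
E_FOLDING_TIME_H = 2.0                # average NOx e-folding time in hours (reported as 2 +/- 1 h)


def nox_emission_range(no2_emission):
    """Scale a satellite-derived NO2 emission estimate to a (low, high) net NOx range."""
    low, high = NOX_TO_NO2_FACTOR_RANGE
    return no2_emission * low, no2_emission * high


def fraction_remaining(hours, tau_h=E_FOLDING_TIME_H):
    """Fraction of NOx left after `hours`, assuming first-order (exponential) decay."""
    return math.exp(-hours / tau_h)


if __name__ == "__main__":
    # 10.0 is a made-up input in arbitrary emission units, not a measured value.
    low, high = nox_emission_range(10.0)
    print(f"net NOx between {low:.1f} and {high:.1f} (same units as the input)")
    print(f"fraction of NOx left after 2 h: {fraction_remaining(2.0):.3f}")
```

Under the exponential-decay assumption, one e-folding time leaves a fraction 1/e of the initial NOx, so the reported uncertainty of ± 1 h in the e-folding time translates into a wide spread in the remaining fraction a few hours downwind.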