skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Cain, Jacob"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Accurate air pollution monitoring is critical to understand and mitigate the impacts of air pollution on human health and ecosystems. Due to the limited number and geographical coverage of advanced, highly accurate sensors monitoring air pollutants, many low-cost and low-accuracy sensors have been deployed. Calibrating low-cost sensors is essential to fill the geographical gap in sensor coverage. We systematically examined how different machine learning (ML) models and open-source packages could help improve the accuracy of particulate matter (PM) 2.5 data collected by Purple Air sensors. Eleven ML models and five packages were examined. This systematic study found that both models and packages impacted accuracy, while the random training/testing split ratio (e.g., 80/20 vs. 70/30) had minimal impact (0.745% difference for R2). Long Short-Term Memory (LSTM) models trained in RStudio and TensorFlow excelled, with high R2 scores of 0.856 and 0.857 and low Root Mean Squared Errors (RMSEs) of 4.25 µg/m3 and 4.26 µg/m3, respectively. However, LSTM models may be too slow (1.5 h) or computation-intensive for applications with fast response requirements. Tree-boosted models including XGBoost (0.7612, 5.377 µg/m3) in RStudio and Random Forest (RF) (0.7632, 5.366 µg/m3) in TensorFlow offered good performance with shorter training times (<1 min) and may be suitable for such applications. These findings suggest that AI/ML models, particularly LSTM models, can effectively calibrate low-cost sensors to produce precise, localized air quality data. This research is among the most comprehensive studies on AI/ML for air pollutant calibration. We also discussed limitations, applicability to other sensors, and the explanations for good model performances. This research can be adapted to enhance air quality monitoring for public health risk assessments, support broader environmental health initiatives, and inform policy decisions. 
    more » « less
    Free, publicly-accessible full text available February 1, 2026
  2. The COVID-19 pandemic has been sweeping across the United States of America since early 2020. The whole world was waiting for vaccination to end this pandemic. Since the approval of the first vaccine by the U.S. CDC on 9 November 2020, nearly 67.5% of the US population have been fully vaccinated by 10 July 2022. While quite successful in controlling the spreading of COVID-19, there were voices against vaccines. Therefore, this research utilizes geo-tweets and Bayesian-based method to investigate public opinions towards vaccines based on (1) the spatiotemporal changes in public engagement and public sentiment; (2) how the public engagement and sentiment react to different vaccine-related topics; (3) how various races behave differently. We connected the phenomenon observed to real-time and historical events. We found that in general the public is positive towards COVID-19 vaccines. Public sentiment positivity went up as more people were vaccinated. Public sentiment on specific topics varied in different periods. African Americans’ sentiment toward vaccines was relatively lower than other races. 
    more » « less
  3. Batten disease is unique among lysosomal storage disorders for the early and profound manifestation in the central nervous system, but little is known regarding potential neuron-specific roles for the disease-associated proteins. We demonstrate substantial overlap in the protein interactomes of three transmembrane Batten proteins (CLN3, CLN6, and CLN8), and that their absence leads to synaptic depletion of key partners (i.e., SNAREs and tethers) and altered synaptic SNARE complexing in vivo , demonstrating a novel shared etiology. 
    more » « less