Title: A framework of zero-inflated Bayesian negative binomial regression models for spatiotemporal data
Spatiotemporal data with a large proportion of zeros arise in many areas, such as epidemiology and public health. We use a Bayesian framework to fit zero-inflated negative binomial models and employ a set of latent variables from Pólya-Gamma distributions to derive an efficient Gibbs sampler. The proposed model accommodates varying spatial and temporal random effects through Gaussian process (GP) priors, which offer both simplicity and flexibility in modeling nonlinear relationships through a covariance function. To overcome the computational bottleneck that GPs may suffer when the sample size is large, we adopt the nearest-neighbor GP approach, which approximates the covariance matrix using local experts. In the simulation study, we adopt multiple settings with varying numbers of spatial locations to evaluate the performance of the proposed model, such as its estimation of spatial and temporal random effects, and compare the results to other methods. We also apply the proposed model to COVID-19 death counts in the state of Florida, USA, from 3/25/2020 through 7/29/2020 to examine relationships between social vulnerability and COVID-19 deaths.
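For readers unfamiliar with the zero-inflated negative binomial (ZINB) distribution named in the abstract, the following is a minimal sketch of its probability mass function. The mean/dispersion parameterization and the names `nb_pmf`, `zinb_pmf`, `pi`, `mu`, and `r` are assumptions chosen for illustration; the abstract does not state the paper's parameterization, and the sketch covers neither the Pólya-Gamma Gibbs sampler nor the spatial and temporal random effects.

```python
import math


def nb_pmf(y, mu, r):
    """Negative binomial pmf with mean mu and dispersion r (variance mu + mu**2 / r)."""
    log_p = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, r):
    """Zero-inflated NB: a structural zero with probability pi, otherwise an NB count."""
    base = (1.0 - pi) * nb_pmf(y, mu, r)
    # Zeros come from two sources: the structural-zero component and the NB itself.
    return pi + base if y == 0 else base


# Illustrative (hypothetical) parameter values, not taken from the paper.
probs = [zinb_pmf(y, pi=0.4, mu=3.0, r=2.0) for y in range(500)]
total = sum(probs)                               # should be numerically 1
mean = sum(y * p for y, p in enumerate(probs))   # should be close to (1 - pi) * mu
```

Under this parameterization the zero-inflation component raises the probability of observing a zero above what the negative binomial alone would give, and the overall mean shrinks to `(1 - pi) * mu`.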
Award ID(s):
1924792 2318925
PAR ID:
10479170
Author(s) / Creator(s):
;
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
Journal of Statistical Planning and Inference
Volume:
229
Issue:
C
ISSN:
0378-3758
Page Range / eLocation ID:
106098
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The COVID-19 pandemic has dramatically transformed human mobility patterns. Therefore, human mobility prediction for the “new normal” is crucial to infrastructure redesign, emergency management, and urban planning after the pandemic. This paper aims to predict the number of visits people make to various locations in New York City using COVID and mobility data from the past two years. To quantitatively model the impact of COVID cases on human mobility patterns and predict mobility patterns across the pandemic period, this paper develops a model, CCAAT-GCN (Cross- and Context-Attention based Spatial-Temporal Graph Convolutional Networks). The proposed model is validated using SafeGraph data in New York City from August 2020 to April 2022. A rich set of baselines is evaluated for comparison, and the results demonstrate the superior performance of the proposed method. Also, the attention matrix learned by our model exhibits a strong alignment with the COVID-19 situation and the points of interest within the geographic region. This alignment suggests that the model effectively captures the intricate relationships between COVID-19 case rates and human mobility patterns. The developed model and findings can offer insights into mobility pattern prediction for future disruptive events and pandemics, so as to assist planners, decision-makers, and policymakers with emergency preparedness.
  2. Estimating human mobility responses to the large-scale spreading of the COVID-19 pandemic is crucial, since it guides policymakers in issuing non-pharmaceutical interventions, such as the closure or reopening of businesses. It is challenging to model due to complex social contexts and limited training data. Recently, we proposed a conditional generative adversarial network (COVID-GAN) to estimate human mobility response under a set of social and policy conditions integrated from multiple data sources. Although COVID-GAN achieves a good average estimation accuracy under real-world conditions, it produces higher errors in certain regions due to the presence of spatial heterogeneity and outliers. To address these issues, in this article, we extend our prior work by introducing a new spatio-temporal deep generative model, namely, COVID-GAN+. COVID-GAN+ deals with the spatial heterogeneity issue by introducing a new spatial feature layer that utilizes the local Moran statistic to model the spatial heterogeneity strength in the data. In addition, we redesign the training objective to learn the estimated mobility changes from historical average levels to mitigate the effects of spatial outliers. We perform comprehensive evaluations using urban mobility data derived from cell phone records and census data. Results show that COVID-GAN+ can better approximate real-world human mobility responses than prior methods, including COVID-GAN.
  3. Water quality is affected by multiple spatial and temporal factors, including the surrounding land characteristics, human activities, and antecedent precipitation amounts. However, identifying the relationships between water quality and spatially and temporally varying environmental variables with a machine learning technique in a heterogeneous urban landscape has been understudied. We explore how seasonal and variable precipitation amounts and other small-scale landscape variables affect E. coli, total suspended solids (TSS), nitrogen-nitrate, orthophosphate, lead, and zinc concentrations in Portland, Oregon, USA. Mann–Whitney tests were used to detect differences in water quality between seasons and COVID-19 periods. Spearman’s rank correlation analysis was used to identify the relationship between water quality and explanatory variables. A Random Forest (RF) model was used to predict water quality using antecedent precipitation amounts and landscape variables as inputs. The performance of RF was compared with that of ordinary least squares (OLS). Mann–Whitney tests identified statistically significant differences in all pollutant concentrations (except TSS) between the wet and dry seasons. Nitrate was the only pollutant to display statistically significant reductions in median concentrations (from 1.5 mg/L to 1.04 mg/L) during the COVID-19 lockdown period, likely associated with reduced traffic volumes. Spearman’s correlation analysis identified the highest correlation coefficients between one-day precipitation amounts and E. coli, lead, zinc, and TSS concentrations. Road length is positively associated with E. coli and zinc. The Random Forest (RF) model best predicts orthophosphate concentrations (R² = 0.58), followed by TSS (R² = 0.54) and nitrate (R² = 0.46). E. coli was the most difficult to model and had the highest RMSE, MAE, and MAPE values. Overall, the Random Forest model outperformed OLS, as evaluated by RMSE, MAE, MAPE, and R².
The Random Forest was an effective approach to modeling pollutant concentrations using both categorical seasonal and COVID data along with continuous rain and landscape variables to predict water quality in urban streams. Implementing optimization techniques can further improve the model’s performance and allow researchers to use a machine learning approach for water quality modeling. 
  4. While COVID-19 text misinformation has already been investigated by various scholars, fewer research efforts have been devoted to characterizing and understanding COVID-19 misinformation that is carried out through visuals like photographs and memes. In this paper, we present a mixed-method analysis of image-based COVID-19 misinformation in 2020 on Twitter. We deploy a computational pipeline to identify COVID-19 related tweets, download the images contained in them, and group together visually similar images. We then develop a codebook to characterize COVID-19 misinformation and manually label images as misinformation or not. Finally, we perform a quantitative analysis of tweets containing COVID-19 misinformation images. We identify five types of COVID-19 misinformation, from a wrong understanding of the threat severity of COVID-19 to the promotion of fake cures and conspiracy theories. We also find that tweets containing COVID-19 misinformation images do not receive more interactions than baseline tweets with random images posted by the same set of users. As for temporal properties, COVID-19 misinformation images are shared for longer periods of time than non-misinformation ones and have longer burst times. When looking at the users sharing COVID-19 misinformation images on Twitter from the perspective of their political leanings, we find that pro-Democrat and pro-Republican users share a similar number of tweets containing misleading or false COVID-19 images.
However, the types of images that they share are different: while pro-Democrat users focus on misleading claims about the Trump administration's response to the pandemic, as well as often sharing manipulated images intended as satire, pro-Republican users often promote hydroxychloroquine, an ineffective medicine against COVID-19, as well as conspiracy theories about the origin of the virus. Our analysis sets a basis for better understanding COVID-19 misinformation images on social media and the nuances of moderating them effectively.
  5. Beginning in early 2020, the novel coronavirus was the subject of frequent and sustained news coverage. Building on prior literature on the stress-inducing effects of consuming news during a large-scale crisis, we used network analysis to investigate the association between coronavirus disease 2019 (COVID-19) news consumption, COVID-19-related psychological stress, worries about oneself and one’s loved ones getting COVID-19, and sleep quality. Data were collected in March 2020 from 586 adults (45.2% female; 72.9% White) recruited via Amazon Mechanical Turk in the U.S. Participants completed online surveys assessing attitudes and behaviors related to COVID-19 and a questionnaire assessing seven domains of sleep quality. Networks were constructed using partial regularized correlation matrices. As hypothesized, COVID-19 news consumption was positively associated with COVID-19-related psychological stress and concerns about one’s loved ones getting COVID-19. However, there were very few associations between COVID-19 news consumption and sleep quality indices, and gender did not moderate any of the observed relationships. This study replicates and extends previous findings that COVID-19-news consumption is linked with psychological stress related to the pandemic, but even under such conditions, sleep quality can be spared due to the pandemic allowing for flexibility in morning work/school schedules. 