Title: A State-Level Socioeconomic Data Collection of the United States for COVID-19 Research
The outbreak of COVID-19 in late 2019 not only threatens the health and lives of humankind but also significantly impacts public policies, economic activities, and human behavior patterns. To understand the impact and better prepare for future outbreaks, socioeconomic factors play significant roles in (1) determinant analysis with health care, environmental exposure and health behavior; (2) human mobility analyses driven by policies; (3) economic pressure and recovery analyses for decision making; and (4) short- to long-term social impact analysis for equity, justice and diversity. To support these analyses for rapid impact responses, state-level socioeconomic factors for the United States of America (USA) are collected and integrated into topic-based indicators, including (1) the daily quantitative policy stringency index; (2) dynamic economic indices at multiple time frequencies for GDP, international trade, personal income, employment, the housing market, and others; (3) the socioeconomic determinant baseline covering demographics, housing financial situation, and medical resources. This paper introduces the measurements and metadata of the relevant socioeconomic data collection, along with the sharing platform, data warehouse framework and quality control strategies. Unlike existing COVID-19 related data products, this collection recognizes geospatial and dynamic factors as essential dimensions of epidemiologic research and scales down the spatial resolution of socioeconomic data collection from the country level to the state level of the USA, with a standard data format and high quality.
Award ID(s):
2027521 1841520 1835507
NSF-PAR ID:
10208494
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Data
Volume:
5
Issue:
4
ISSN:
2306-5729
Page Range / eLocation ID:
118
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Deep Learning for Time-series plays a key role in AI for healthcare. To predict the progress of infectious disease outbreaks and demonstrate clear population-level impact, more granular analyses are urgently needed that control for important and potentially confounding county-level socioeconomic and health factors. We forecast US county-level COVID-19 infections using the Temporal Fusion Transformer (TFT). We focus on heterogeneous time-series deep learning model prediction while interpreting the complex spatiotemporal features learned from the data. The significance of the work is grounded in a real-world COVID-19 infection prediction with highly non-stationary, finely granular, and heterogeneous data. 1) Our model can capture the detailed daily changes of temporal and spatial model behaviors and achieves better prediction performance compared to other time-series models. 2) We analyzed the attention patterns from TFT to interpret the temporal and spatial patterns learned by the model. 3) We collected around 2.5 years of socioeconomic and health features for 3142 US counties, such as observed cases, and a number of static (age distribution and health disparity) and dynamic features (vaccination, disease spread, transmissible cases, and social distancing). Using the proposed framework, we have shown that our model can learn complex interactions. Interpreting different impacts at the county level would be crucial for understanding the infection process that can help effective public health decision-making. 
  3. Dynamic models are used to assess the impact of three types of face masks (cloth masks, surgical/procedure masks and respirators) in controlling the COVID-19 pandemic in the USA. We showed that the pandemic would have failed to establish in the USA if a nationwide mask mandate, based on using respirators with moderately high compliance, had been implemented during the first two months of the pandemic. The other mask types would fail to prevent the pandemic from becoming established. When mask usage compliance is low to moderate, respirators are far more effective in reducing disease burden. Using data from the third wave, we showed that the epidemic could be eliminated in the USA if at least 40% of the population consistently wore respirators in public. Surgical masks can also lead to elimination, but require compliance of at least 55%. Daily COVID-19 mortality could be eliminated in the USA by June or July 2021 if 95% of the population opted for either respirators or surgical masks from the beginning of the third wave. We showed that the prospect of effective control or elimination of the pandemic using a mask-based strategy is greatly enhanced if combined with other non-pharmaceutical interventions (NPIs) that significantly reduce the baseline community transmission. By slightly modifying the model to include the effect of a vaccine against COVID-19 and waning vaccine-derived and natural immunity, this study shows that the waning of such immunity could trigger multiple new waves of the pandemic in the USA. The number, severity and duration of the projected waves depend on the quality of mask type used and the level of increase in the baseline levels of other NPIs used in the community during the onset of the third wave of the pandemic in the USA.
Specifically, no severe fourth or subsequent wave of the pandemic will be recorded in the USA if surgical masks or respirators are used, particularly if the mask use strategy is combined with an increase in the baseline levels of other NPIs. This study further emphasizes the role of human behaviour towards masking on COVID-19 burden, and highlights the urgent need to maintain a healthy stockpile of highly effective respiratory protection, particularly respirators, to be made available to the general public in times of future outbreaks or pandemics of respiratory diseases that inflict severe public health and socio-economic burden on the population. 
  4.
    The sudden outbreak of the COVID-19 pandemic has brought drastic changes to people’s daily lives, work, and the surrounding environment. Investigations into these changes are very important for decision makers to implement policies on economic loss assessments and stimulation packages, city reopening, resilience of the environment, and arrangement of medical resources. In order to analyze the impact of COVID-19 on people’s lives, activities, and the natural environment, this paper investigates the spatial and temporal characteristics of Nighttime Light (NTL) radiance and Air Quality Index (AQI) before and during the pandemic in mainland China. The monthly mean NTL radiance, and daily and monthly mean AQI are calculated over mainland China and compared before and during the pandemic. Our results show that the monthly average NTL brightness is much lower during the quarantine period than before. This study categorizes NTL into three classes: residential areas, transportation and public facilities, and commercial centers, with NTL radiance ranges of 5–20, 20–40, and greater than 40 nW·cm⁻²·sr⁻¹, respectively. We found that the Number of Pixels (NOP) with NTL detection increased in the residential areas and decreased in the commercial centers for most of the provinces after the shutdown, while transportation and public facilities generally stayed the same. More specifically, we examined these factors in Wuhan, where the first confirmed cases were reported, and where the earliest quarantine measures were taken. Pixels associated with commercial centers were observed to have lower NTL radiance values, indicating a dimming behavior, while residential area pixels recorded increased levels of brightness after the beginning of the lockdown.
The study also discovered a significant decreasing trend in the daily average AQI for mainland China from January to March 2020, with cleaner air in most provinces during February and March, compared to January 2020. In conclusion, the outbreak and spread of COVID-19 has had a crucial impact on people’s daily lives and activity ranges through the increased implementation of lockdown and quarantine policies. On the other hand, the air quality of mainland China has improved with the reduction in non-essential industries and motor vehicle usage. This evidence demonstrates that the Chinese government has executed very stringent quarantine policies to deal with the pandemic. The decisive response to control the spread of COVID-19 provides a reference for other parts of the world. 
  5. Yang, Chaowei (Ed.)
    The COVID-19 pandemic has profoundly impacted the economy and human lives worldwide, particularly the vulnerable low-income population. We employ a large panel dataset of 5.6 million daily transactions from 2.6 million debit cards owned by the low-income population in the U.S. to quantify the joint impacts of the state lockdowns and stimulus payments on this population’s spending along the inter-temporal, geo-spatial, and cross-categorical dimensions. Leveraging difference-in-differences analyses at the per-card and zip code levels, we uncover three key findings. (1) Inter-temporally, the state lockdowns diminished the daily average spending relative to the same period in 2019 by $3.9 per card and $2,214 per zip code, whereas the stimulus payments elevated the daily average spending by $15.7 per card and $3,307 per zip code. (2) Spatial heterogeneity prevailed: Democratic zip codes displayed much more volatile dynamics, with an initial decline three times that of Republican zip codes, followed by a higher rebound and a net gain after the stimulus payments; also, the Southwest exhibited the highest initial decline whereas the Southeast had the largest net gain after the stimulus payments. (3) Across 26 categories, the stimulus payments promoted spending in categories that enhanced public health and charitable donations and reduced food insecurity and the digital divide, while also stimulating non-essential and even undesirable categories, such as liquor and cigars. In addition, spatial association analysis was employed to identify spatial dependency and local hot spots of spending changes at the county level. Overall, these analyses reveal the imperative need for more geo- and category-targeted stimulus programs, as well as more effective and strategic policy communications, to protect and promote the well-being of the low-income population during public health and economic crises.