Title: Building an Open Resources Repository for COVID-19 Research
Abstract: The COVID-19 outbreak has been declared a global pandemic by the World Health Organization, with cases increasing rapidly in most countries. A wide range of research is urgently needed to understand the pandemic, including its transmissibility, geographic spread, risk factors for infection, and economic impacts. Reliable data archiving and sharing are essential to jump-start innovative research to combat COVID-19. This research is a collaborative effort to build such an archive, collecting data resources relevant to COVID-19 research, such as daily cases, social media, population mobility, health facilities, climate, socioeconomic data, research articles, policy and regulation, and global news. Because the data sources are heterogeneous, our effort also includes processing and integrating the datasets on GIS (Geographic Information System) base maps to make them relatable and comparable. To keep the data files permanent, we published all open data to the Harvard Dataverse (https://dataverse.harvard.edu/dataverse/2019ncov), an online data management and sharing platform that assigns a permanent Digital Object Identifier to each dataset. Finally, preliminary studies based on the shared COVID-19 datasets revealed different spatial transmission patterns among mainland China, Italy, and the United States.
Award ID(s):
2027540
NSF-PAR ID:
10288430
Journal Name:
Data and Information Management
Volume:
4
Issue:
3
ISSN:
2543-9251
Page Range / eLocation ID:
130 to 147
Sponsoring Org:
National Science Foundation
More Like this
  1. The systemic challenges of the COVID-19 pandemic require cross-disciplinary collaboration in a global and timely fashion. Such collaboration needs open research practices and the sharing of research outputs, such as data and code, thereby facilitating research and research reproducibility and timely collaboration beyond borders. The Research Data Alliance COVID-19 Working Group recently published a set of recommendations and guidelines on data sharing and related best practices for COVID-19 research. These guidelines include recommendations for researchers, policymakers, funders, publishers and infrastructure providers from the perspective of different domains (Clinical Medicine, Omics, Epidemiology, Social Sciences, Community Participation, Indigenous Peoples, Research Software, Legal and Ethical Considerations). Several overarching themes have emerged from this document such as the need to balance the creation of data adherent to FAIR principles (findable, accessible, interoperable and reusable), with the need for quick data release; the use of trustworthy research data repositories; the use of well-annotated data with meaningful metadata; and practices of documenting methods and software. The resulting document marks an unprecedented cross-disciplinary, cross-sectoral, and cross-jurisdictional effort authored by over 160 experts from around the globe. This letter summarises key points of the Recommendations and Guidelines, highlights the relevant findings, shines a spotlight on the process, and suggests how these developments can be leveraged by the wider scientific community.
  2. Background: The surge of telemedicine use during the early stages of the COVID-19 pandemic has been well documented. However, scarce evidence considers the use of telemedicine in the subsequent period. Objective: This study aims to evaluate use patterns of video-based telemedicine visits for ambulatory care and urgent care provision over the course of recurring pandemic waves in 1 large health system in New York City (NYC) and what this means for health care delivery. Methods: Retrospective electronic health record (EHR) data of patients from January 1, 2020, to February 28, 2022, were used to longitudinally track and analyze telemedicine and in-person visit volumes across ambulatory care specialties and urgent care, as well as compare them to a prepandemic baseline (June-November 2019). Diagnosis codes to differentiate suspected COVID-19 visits from non–COVID-19 visits, as well as evaluating COVID-19–based telemedicine use over time, were compared to the total number of COVID-19–positive cases in the same geographic region (city level). The time series data were segmented based on change-point analysis, and variances in visit trends were compared between the segments. Results: The emergence of COVID-19 prompted an early increase in the number of telemedicine visits across the urgent care and ambulatory care settings. This use continued throughout the pandemic at a much higher level than the prepandemic baseline for both COVID-19 and non–COVID-19 suspected visits, despite the fluctuation in COVID-19 cases throughout the pandemic and the resumption of in-person clinical services. The use of telemedicine-based urgent care services for COVID-19 suspected visits showed more variance in response to each pandemic wave, but telemedicine visits for ambulatory care have remained relatively steady after the initial crisis period. During the Omicron wave, the use of all visit types, including in-person activities, decreased. Patients between 25 and 34 years of age were the largest users of telemedicine-based urgent care. Patient satisfaction with telemedicine-based urgent care remained high despite the rapid scaling of services to meet increased demand. Conclusions: The trend of the increased use of telemedicine as a means of health care delivery relative to the pre–COVID-19 baseline has been maintained throughout the later pandemic periods despite fluctuating COVID-19 cases and the resumption of in-person care delivery. Overall satisfaction with telemedicine-based care is also high. The trends in telemedicine use suggest that telemedicine-based health care delivery has become a mainstream and sustained supplement to in-person-based ambulatory care, particularly for younger patients, for both urgent and nonurgent care needs. These findings have implications for the health care delivery system, including practice leaders, insurers, and policymakers. Further investigation is needed to evaluate telemedicine adoption by key demographics, identify ongoing barriers to adoption, and explore the impacts of sustained use of telemedicine on health care outcomes and experience.
  3. COVID-19, known as Coronavirus Disease 2019, is a major health issue resulting from novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Its emergence has posed a significant menace to the global medical community and healthcare systems across the world. Notably, on December 12, 2020, the Food and Drug Administration (FDA) approved the utilization of the Pfizer and Moderna COVID-19 vaccines. As of July 31, 2022, the United States has witnessed over 91.3 million cases of COVID-19 and nearly 1.03 million fatalities. An intriguing observation is the recent reduction in the mortality rate of COVID-19, attributed to an augmented focus on early detection, comprehensive screening, and widespread vaccination. Despite this positive trend in some demographics, it is noteworthy that the overall incidence rates of COVID-19 among African American and Hispanic populations have continued to escalate, even as mortality rates have decreased. Therefore, the objective of this research study is to present an overview of COVID-19, spotlighting the disparities among different racial and ethnic groups. It also delves into the management of COVID-19 within minority populations. To reach our research objective, we used a publicly available COVID-19 dataset from Kaggle: https://www.kaggle.com/datasets/paultimothymooney/covid19-casesand- deaths-by-race. In addition, we obtained COVID-19 datasets from the 10 states with the highest proportion of African American populations. Considerable strides have been made against COVID-19. However, the success rate of treatment in the African American population remains relatively limited when compared to other ethnic groups. Hence, there arises a pressing need for novel strategies and innovative approaches to not only encourage prevention measures against COVID-19, but also to increase survival rates, diminish mortality rates, and ultimately improve the health outcomes of ethnic and racial minorities.
  4. Abstract: The Coronavirus Disease 2019 (COVID-19) has had a profound impact on global health and economy, making it crucial to build accurate and interpretable data-driven predictive models for COVID-19 cases to improve public policy making. The extremely large scale of the pandemic and the intrinsically changing transmission characteristics pose a great challenge for effectively predicting COVID-19 cases. To address this challenge, we propose a novel hybrid model in which the interpretability of the Autoregressive model (AR) and the predictive power of the long short-term memory neural networks (LSTM) join forces. The proposed hybrid model is formalized as a neural network with an architecture that connects two composing model blocks, of which the relative contribution is decided data-adaptively in the training procedure. We demonstrate the favorable performance of the hybrid model over its two single composing models as well as other popular predictive models through comprehensive numerical studies on two data sources under multiple evaluation metrics. Specifically, in county-level data of 8 California counties, our hybrid model achieves 4.173% MAPE, outperforming the composing AR (5.629%) and LSTM (4.934%) alone on average. In country-level datasets, our hybrid model outperforms the widely-used predictive models such as AR, LSTM, Support Vector Machines, Gradient Boosting, and Random Forest, in predicting the COVID-19 cases in Japan, Canada, Brazil, Argentina, Singapore, Italy, and the United Kingdom. In addition to the predictive performance, we illustrate the interpretability of our proposed hybrid model using the estimated AR component, which is a key feature that is not shared by most black-box predictive models for COVID-19 cases. Our study provides a new and promising direction for building effective and interpretable data-driven models for COVID-19 cases, which could have significant implications for public health policy making and control of the current COVID-19 and potential future pandemics.
  5. Background: The current study explores how characteristics of individuals, their communities, and their relative exposure to nearby Covid-19 cases are associated with specific fears or perceived threat/risk of the virus itself during the early stages of the pandemic in March 2020. Aims: Drawing from research emphasizing the intersectional relationships between individual social vulnerabilities, community characteristics, and Covid-19 outbreak locales, we test several hypotheses predicting fear. Method: Using data from a large-scale survey of 10,368 U.S. adults from March 2020, we construct a series of hierarchical linear and logistic regression models that nest individuals within their residential counties in order to account for key socio-demographic characteristics of individuals, communities, and each respondent’s geographic proximity to Covid-19 cases. Results: Results show that individual fear and perceived risk to oneself and family is predicted by individual social vulnerabilities, the type of community in which respondents live, and the relative presence of the virus in nearby places. Conclusion: Our findings highlight the importance of understanding fear, particularly as a possible mediator for both mental and physical health outcomes. Likewise, we emphasize ongoing efforts aimed at understanding how different groups and communities respond to fear and/or concern over Covid-19 as the pandemic remains ongoing. 