skip to main content


Title: Measuring global multi-scale place connectivity using geotagged social media data
Abstract Shaped by human movement, place connectivity is quantified by the strength of spatial interactions among locations. For decades, spatial scientists have researched place connectivity, applications, and metrics. The growing popularity of social media provides a new data stream where spatial social interaction measures are largely devoid of privacy issues, easily assessable, and harmonized. In this study, we introduced a global multi-scale place connectivity index (PCI) based on spatial interactions among places revealed by geotagged tweets as a spatiotemporal-continuous and easy-to-implement measurement. The multi-scale PCI, demonstrated at the US county level, exhibits a strong positive association with SafeGraph population movement records (10% penetration in the US population) and Facebook’s social connectedness index (SCI), a popular connectivity index based on social networks. We found that PCI has a strong boundary effect and that it generally follows the distance decay, although this force is weaker in more urbanized counties with a denser population. Our investigation further suggests that PCI has great potential in addressing real-world problems that require place connectivity knowledge, exemplified with two applications: (1) modeling the spatial spread of COVID-19 during the early stage of the pandemic and (2) modeling hurricane evacuation destination choice. The methodological and contextual knowledge of PCI, together with the open-sourced PCI datasets at various geographic levels, are expected to support research fields requiring knowledge in human spatial interactions.  more » « less
Award ID(s):
2028791
NSF-PAR ID:
10345855
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
Scientific Reports
Volume:
11
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract This project is funded by the US National Science Foundation (NSF) through their NSF RAPID program under the title “Modeling Corona Spread Using Big Data Analytics.” The project is a joint effort between the Department of Computer & Electrical Engineering and Computer Science at FAU and a research group from LexisNexis Risk Solutions. The novel coronavirus Covid-19 originated in China in early December 2019 and has rapidly spread to many countries around the globe, with the number of confirmed cases increasing every day. Covid-19 is officially a pandemic. It is a novel infection with serious clinical manifestations, including death, and it has reached at least 124 countries and territories. Although the ultimate course and impact of Covid-19 are uncertain, it is not merely possible but likely that the disease will produce enough severe illness to overwhelm the worldwide health care infrastructure. Emerging viral pandemics can place extraordinary and sustained demands on public health and health systems and on providers of essential community services. Modeling the Covid-19 pandemic spread is challenging. But there are data that can be used to project resource demands. Estimates of the reproductive number (R) of SARS-CoV-2 show that at the beginning of the epidemic, each infected person spreads the virus to at least two others, on average (Emanuel et al. in N Engl J Med. 2020, Livingston and Bucher in JAMA 323(14):1335, 2020). A conservatively low estimate is that 5 % of the population could become infected within 3 months. Preliminary data from China and Italy regarding the distribution of case severity and fatality vary widely (Wu and McGoogan in JAMA 323(13):1239–42, 2020). A recent large-scale analysis from China suggests that 80 % of those infected either are asymptomatic or have mild symptoms; a finding that implies that demand for advanced medical services might apply to only 20 % of the total infected. Of patients infected with Covid-19, about 15 % have severe illness and 5 % have critical illness (Emanuel et al. in N Engl J Med. 2020). Overall, mortality ranges from 0.25 % to as high as 3.0 % (Emanuel et al. in N Engl J Med. 2020, Wilson et al. in Emerg Infect Dis 26(6):1339, 2020). Case fatality rates are much higher for vulnerable populations, such as persons over the age of 80 years (> 14 %) and those with coexisting conditions (10 % for those with cardiovascular disease and 7 % for those with diabetes) (Emanuel et al. in N Engl J Med. 2020). Overall, Covid-19 is substantially deadlier than seasonal influenza, which has a mortality of roughly 0.1 %. Public health efforts depend heavily on predicting how diseases such as those caused by Covid-19 spread across the globe. During the early days of a new outbreak, when reliable data are still scarce, researchers turn to mathematical models that can predict where people who could be infected are going and how likely they are to bring the disease with them. These computational methods use known statistical equations that calculate the probability of individuals transmitting the illness. Modern computational power allows these models to quickly incorporate multiple inputs, such as a given disease’s ability to pass from person to person and the movement patterns of potentially infected people traveling by air and land. This process sometimes involves making assumptions about unknown factors, such as an individual’s exact travel pattern. By plugging in different possible versions of each input, however, researchers can update the models as new information becomes available and compare their results to observed patterns for the illness. In this paper we describe the development a model of Corona spread by using innovative big data analytics techniques and tools. We leveraged our experience from research in modeling Ebola spread (Shaw et al. Modeling Ebola Spread and Using HPCC/KEL System. In: Big Data Technologies and Applications 2016 (pp. 347-385). Springer, Cham) to successfully model Corona spread, we will obtain new results, and help in reducing the number of Corona patients. We closely collaborated with LexisNexis, which is a leading US data analytics company and a member of our NSF I/UCRC for Advanced Knowledge Enablement. The lack of a comprehensive view and informative analysis of the status of the pandemic can also cause panic and instability within society. Our work proposes the HPCC Systems Covid-19 tracker, which provides a multi-level view of the pandemic with the informative virus spreading indicators in a timely manner. The system embeds a classical epidemiological model known as SIR and spreading indicators based on causal model. The data solution of the tracker is built on top of the Big Data processing platform HPCC Systems, from ingesting and tracking of various data sources to fast delivery of the data to the public. The HPCC Systems Covid-19 tracker presents the Covid-19 data on a daily, weekly, and cumulative basis up to global-level and down to the county-level. It also provides statistical analysis for each level such as new cases per 100,000 population. The primary analysis such as Contagion Risk and Infection State is based on causal model with a seven-day sliding window. Our work has been released as a publicly available website to the world and attracted a great volume of traffic. The project is open-sourced and available on GitHub. The system was developed on the LexisNexis HPCC Systems, which is briefly described in the paper. 
    more » « less
  2. null (Ed.)
    Research on citizen science programmes has highlighted that they can foster science content and knowledge gain, enhance pro-environmental behaviour and cultivate civic action among participants. Especially in the case of place-based citizen science, which requires hands-on repeated activity in an out-of-door setting through a scientific lens, evidence suggests that some of these outcomes may be linked to the unique people–place relationships and interactions afforded by such programmes. Even still, studies that empirically examine the influence of place on citizen science participant and programme outcomes are scant. This is due, in part, to the methodological challenges involved in interrogating complex aspects of a person's sense of place—aspects like place attachment—the emotional bonds between people and place. Here, an adapted three-dimensional model of place attachment is proposed as a theoretical framework from which place-based citizen science experiences and outcomes might be empirically examined in depth. The model, which posits personal, social and natural environment dimensions of place attachment is contextualized with research findings from the US-based Coastal Observation and Seabird Survey Team (COASST) citizen science programme. Data from COASST suggest that participants do exhibit place attachment in all three dimensions of attachment, categorized within seven unique constructs, although questions remain regarding the unique intensity, make-up (shape) and scale (spatial, social and nature-science) of individual-level attachment along the three central dimensions. Critically, more research is needed to investigate whether the unique place attachment ‘profile’ of participants is a function of personal, social or programmatic variables pre- and post-programme participation. To encourage further scholarship on potential links between the experiences, exposures and programme components of place-based citizen science and the place attachment profiles of participants, this paper includes a brief review of the research opportunities presented by the adapted three-dimensional place attachment model discussed. Advancing this line of inquiry is an important component of broader efforts to understand how sense of place is altered via place-based citizen science and whether or not that is linked to specific programme outputs or participant outcomes in science knowledge, ecological understanding and civic engagement. 
    more » « less
  3. null (Ed.)
    The nexus of food, energy, and water systems (FEWS) has become a salient research topic, as well as a pressing societal and policy challenge. Computational modeling is a key tool in addressing these challenges, and FEWS modeling as a subfield is now established. However, social dimensions of FEWS nexus issues, such as individual or social learning, technology adoption decisions, and adaptive behaviors, remain relatively underdeveloped in FEWS modeling and research. Agent-based models (ABMs) have received increasing usage recently in efforts to better represent and integrate human behavior into FEWS research. A systematic review identified 29 articles in which at least two food, energy, or water sectors were explicitly considered with an ABM and/or ABM-coupled modeling approach. Agent decision-making and behavior ranged from reactive to active, motivated by primarily economic objectives to multi-criteria in nature, and implemented with individual-based to highly aggregated entities. However, a significant proportion of models did not contain agent interactions, or did not base agent decision-making on existing behavioral theories. Model design choices imposed by data limitations, structural requirements for coupling with other simulation models, or spatial and/or temporal scales of application resulted in agent representations lacking explicit decision-making processes or social interactions. In contrast, several methodological innovations were also noted, which were catalyzed by the challenges associated with developing multi-scale, cross-sector models. Several avenues for future research with ABMs in FEWS research are suggested based on these findings. The reviewed ABM applications represent progress, yet many opportunities for more behaviorally rich agent-based modeling in the FEWS context remain. 
    more » « less
  4. Abstract Background West Nile virus (WNV) is the leading cause of mosquito-borne illness in the continental USA. WNV occurrence has high spatiotemporal variation, and current approaches to targeted control of the virus are limited, making forecasting a public health priority. However, little research has been done to compare strengths and weaknesses of WNV disease forecasting approaches on the national scale. We used forecasts submitted to the 2020 WNV Forecasting Challenge, an open challenge organized by the Centers for Disease Control and Prevention, to assess the status of WNV neuroinvasive disease (WNND) prediction and identify avenues for improvement. Methods We performed a multi-model comparative assessment of probabilistic forecasts submitted by 15 teams for annual WNND cases in US counties for 2020 and assessed forecast accuracy, calibration, and discriminatory power. In the evaluation, we included forecasts produced by comparison models of varying complexity as benchmarks of forecast performance. We also used regression analysis to identify modeling approaches and contextual factors that were associated with forecast skill. Results Simple models based on historical WNND cases generally scored better than more complex models and combined higher discriminatory power with better calibration of uncertainty. Forecast skill improved across updated forecast submissions submitted during the 2020 season. Among models using additional data, inclusion of climate or human demographic data was associated with higher skill, while inclusion of mosquito or land use data was associated with lower skill. We also identified population size, extreme minimum winter temperature, and interannual variation in WNND cases as county-level characteristics associated with variation in forecast skill. Conclusions Historical WNND cases were strong predictors of future cases with minimal increase in skill achieved by models that included other factors. Although opportunities might exist to specifically improve predictions for areas with large populations and low or high winter temperatures, areas with high case-count variability are intrinsically more difficult to predict. Also, the prediction of outbreaks, which are outliers relative to typical case numbers, remains difficult. Further improvements to prediction could be obtained with improved calibration of forecast uncertainty and access to real-time data streams (e.g. current weather and preliminary human cases). Graphical Abstract 
    more » « less
  5. null (Ed.)
    Abstract The objective of this study is to examine the transmission risk of COVID-19 based on cross-county population co-location data from Facebook. The rapid spread of COVID-19 in the United States has imposed a major threat to public health, the real economy, and human well-being. With the absence of effective vaccines, the preventive actions of social distancing, travel reduction and stay-at-home orders are recognized as essential non-pharmacologic approaches to control the infection and spatial spread of COVID-19. Prior studies demonstrated that human movement and mobility drove the spatiotemporal distribution of COVID-19 in China. Little is known, however, about the patterns and effects of co-location reduction on cross-county transmission risk of COVID-19. This study utilizes Facebook co-location data for all counties in the United States from March to early May 2020 for conducting spatial network analysis where nodes represent counties and edge weights are associated with the co-location probability of populations of the counties. The analysis examines the synchronicity and time lag between travel reduction and pandemic growth trajectory to evaluate the efficacy of social distancing in ceasing the population co-location probabilities, and subsequently the growth in weekly new cases across counties. The results show that the mitigation effects of co-location reduction appear in the growth of weekly new confirmed cases with one week of delay. The analysis categorizes counties based on the number of confirmed COVID-19 cases and examines co-location patterns within and across groups. Significant segregation is found among different county groups. The results suggest that within-group co-location probabilities (e.g., co-location probabilities among counties with high numbers of cases) remain stable, and social distancing policies primarily resulted in reduced cross-group co-location probabilities (due to travel reduction from counties with large number of cases to counties with low numbers of cases). These findings could have important practical implications for local governments to inform their intervention measures for monitoring and reducing the spread of COVID-19, as well as for adoption in future pandemics. Public policy, economic forecasting, and epidemic modeling need to account for population co-location patterns in evaluating transmission risk of COVID-19 across counties. 
    more » « less