Title: Leveraging network representation learning and community detection for analyzing the activity profiles of adolescents
Abstract

Human mobility analysis plays a crucial role in urban analysis, city planning, epidemic modeling, and even understanding neighborhood effects on individuals’ health. Often, these studies model human mobility in the form of co-location networks. Network representation learning models have recently shown tremendous success on several machine learning tasks on graphs. To the best of our knowledge, however, limited attention has been paid to identifying communities with network representation learning methods specifically for co-location networks. We attempt to address this problem and study user mobility behavior through the communities identified with latent node representations. Specifically, we select several diverse network representation learning models to identify communities from a real-world co-location network. We include both general-purpose representation models that make no assumptions about network modality and approaches designed specifically for human mobility analysis. We evaluate these methods on data collected in the Adolescent Health and Development in Context study. Our experimental analysis reveals that a recently proposed method (LocationTrails) offers a competitive advantage over the other methods: the community assignments it produces are consistent with extant findings regarding neighborhood racial and socio-economic differences in mobility patterns. We also compare the learned activity profiles of individuals by factoring in their residential neighborhoods. Our analysis reveals a significant contrast in the activity profiles of individuals residing in white-dominated versus black-dominated neighborhoods and in advantaged versus disadvantaged neighborhoods in a major metropolitan city in the United States. We provide a clear rationale for this contrastive pattern through insights from the sociological literature.
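
The pipeline the abstract describes (build a co-location network, represent each node as a vector, group the vectors into communities) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' method: the toy visit records, the function names, and in particular the node representation, which is just each user's normalized adjacency row standing in for a learned embedding such as the ones the study compares. Clustering uses plain k-means with a deterministic initialization.

```python
from collections import defaultdict
from itertools import combinations


def build_colocation_network(visits):
    """Link two users each time they share a location in the same time slot."""
    present = defaultdict(set)
    users = set()
    for user, location, slot in visits:
        present[(location, slot)].add(user)
        users.add(user)
    weights = {u: defaultdict(int) for u in users}
    for group in present.values():
        for u, v in combinations(sorted(group), 2):
            weights[u][v] += 1
            weights[v][u] += 1
    return weights


def node_vectors(weights):
    """Stand-in for a learned embedding: L2-normalized adjacency rows."""
    nodes = sorted(weights)
    vectors = {}
    for u in nodes:
        row = [float(weights[u].get(v, 0)) for v in nodes]
        # Self-weight keeps isolated users non-zero and makes rows of
        # users in the same group resemble one another.
        row[nodes.index(u)] = max(row) or 1.0
        norm = sum(x * x for x in row) ** 0.5
        vectors[u] = [x / norm for x in row]
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialization."""
    nodes = sorted(vectors)
    centroids = [vectors[nodes[0]]]
    while len(centroids) < k:
        farthest = max(
            nodes, key=lambda n: min(sq_dist(vectors[n], c) for c in centroids)
        )
        centroids.append(vectors[farthest])
    labels = {}
    for _ in range(iterations):
        labels = {
            n: min(range(k), key=lambda i: sq_dist(vectors[n], centroids[i]))
            for n in nodes
        }
        for i in range(k):
            members = [vectors[n] for n in nodes if labels[n] == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical (user, location, time slot) records.
visits = [
    ("ana", "library", 1), ("ben", "library", 1), ("cy", "library", 1),
    ("ana", "gym", 2), ("ben", "gym", 2),
    ("dee", "mall", 1), ("eli", "mall", 1), ("fay", "mall", 1),
    ("dee", "park", 3), ("eli", "park", 3),
]
network = build_colocation_network(visits)
communities = kmeans(node_vectors(network), k=2)
```

In this toy data the two groups of users never share a location, so their vectors are orthogonal and k-means separates them. Real co-location data would need a genuine representation learning model in place of `node_vectors`.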

 
NSF-PAR ID:
10366713
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Applied Network Science
Volume:
7
Issue:
1
ISSN:
2364-8228
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Online social communities are becoming windows for learning more about the health of populations, through information about our health-related behaviors and outcomes from daily life. At the same time, just as public health data and theory have shown that aspects of the built environment can affect our health-related behaviors and outcomes, it is possible that online social environments (e.g., posts and other attributes of our online social networks) also shape facets of our lives. Given the important role of the online environment in public health research and its implications, the factors that contribute to the generation of such data must be well understood. Here we study the role of the built and online social environments in the expression of dining on Instagram in Abu Dhabi: a ubiquitous social media platform, a city with a vibrant dining culture, and a topic (food posts) that has been studied in relation to public health outcomes. Our study uses available data on user Instagram profiles and their Instagram networks, as well as the local food environment measured through the dining types (e.g., casual dining restaurants, food court restaurants, lounges) by neighborhood. We find evidence that factors of the online social environment (profiles that post about dining versus profiles that do not) have different influences on the relationship between a user’s built environment and social dining expression, with effects also varying by the dining types in the environment and the time of day. We examine the mechanism of the relationships via moderation and mediation analyses. Overall, this study provides evidence that the interplay of online and built environments depends on attributes of those environments and can also vary by time of day. We discuss implications of this synergy for precisely targeting public health interventions, as well as for using online data in public health research.
  2. Advances in ambient environmental monitoring technologies are enabling concerned communities and citizens to collect data to better understand their local environment and potential exposures. These mobile, low-cost tools make it possible to collect data with increased temporal and spatial resolution, providing data on a large scale with unprecedented levels of detail. This type of data has the potential to empower people to make personal decisions about their exposure and to support the development of local strategies for reducing pollution and improving health outcomes. However, calibration of these low-cost instruments has been a challenge. Often, a sensor package is calibrated via field calibration. This involves colocating the sensor package with a high-quality reference instrument for an extended period and then applying machine learning or another model-fitting technique, such as multiple linear regression, to develop a calibration model for converting raw sensor signals to pollutant concentrations. Although this method helps to correct for the effects of ambient conditions (e.g., temperature) and cross sensitivities with nontarget pollutants, there is a growing body of evidence that calibration models can overfit to a given location or set of environmental conditions on account of the incidental correlation between pollutant levels and environmental conditions, including diurnal cycles. As a result, a sensor package trained at a field site may provide less reliable data when moved, or transferred, to a different location. This is a potential concern for applications seeking to perform monitoring away from regulatory monitoring sites, such as personal mobile monitoring or high-resolution monitoring of a neighborhood.
We performed experiments confirming that transferability is indeed a problem and show that it can be improved by collecting data from multiple regulatory sites and building a calibration model that leverages data from a more diverse data set. We deployed three sensor packages to each of three sites with reference monitors (nine packages total) and then rotated the sensor packages through the sites over time. Two sites were in San Diego, CA, with a third outside of Bakersfield, CA, offering varying environmental conditions, general air quality composition, and pollutant concentrations. When compared to prior single-site calibration, the multisite approach exhibits better model transferability for a range of modeling approaches. Our experiments also reveal that random forest is especially prone to overfitting and confirm prior results that transfer is a significant source of both bias and standard error. Linear regression, on the other hand, although it exhibits relatively high error, does not degrade much in transfer. Bias dominated in our experiments, suggesting that transferability might be easily increased by detecting and correcting for bias. Also, given that many monitoring applications involve the deployment of many sensor packages based on the same sensing technology, there is an opportunity to leverage the availability of multiple sensors at multiple sites during calibration to lower the cost of training and better tolerate transfer. We contribute a new neural network architecture model termed split-NN that splits the model into two stages, in which the first stage corrects for sensor-to-sensor variation and the second stage uses the combined data of all the sensors to build a model for a single sensor package. The split-NN modeling approach outperforms multiple linear regression, traditional two- and four-layer neural networks, and random forest models. 
Depending on the training configuration, the split-NN method reduced error by 0%–11% for NO2 and by 6%–13% for O3 compared to random forest.
  3. Introduction: Integrated social and ecological processes shape urban plant communities, but the temporal dynamics and potential for change in these managed communities have rarely been explored. In residential yards, which cover about 40% of urban land area, individuals make decisions that control vegetation outcomes. These decisions may lead to relatively static plant composition and structure, as residents seek to expend little effort to maintain stable landscapes. Alternatively, residents may actively modify plant communities to meet their preferences or address perceived problems, or they may passively allow them to change. In this research, we ask, how and to what extent does residential yard vegetation change over time? Methods: We conducted co-located ecological surveys of yards (in 2008, 2018, and 2019) and social surveys of residents (in 2018) in four diverse neighborhoods of Phoenix, Arizona. Results: 94% of residents had made some changes to their front or back yards since moving in. On average, about 60% of woody vegetation per yard changed between 2008 and 2018, though the number of species present did not differ significantly. In comparison, about 30% of woody vegetation changed in native Sonoran Desert reference areas over 10 years. In yards, about 15% of woody vegetation changed on average in a single year, with up to 90% change in some yards. Greater turnover was observed for homes that were sold, indicating a “pulse” of management. Additionally, we observed greater vegetation turnover in the two older, lawn-dominated neighborhoods surveyed despite differences in neighborhood socioeconomic factors. Discussion: These results indicate that residential plant communities are dynamic over time. Neighborhood age and other characteristics may be important drivers of change, while socioeconomic status neither promotes nor inhibits change at the neighborhood scale.
Our findings highlight an opportunity for management interventions, wherein residents may be open to making conservation-friendly changes if they are already altering the composition of their yards. 
  4. Although there is increasing awareness of disparities in COVID-19 infection risk among vulnerable communities, the effect of behavioral interventions at the scale of individual neighborhoods has not been fully studied. We develop a method to quantify neighborhood activity behaviors at high spatial and temporal resolutions and test whether, and to what extent, behavioral responses to social-distancing policies vary with socioeconomic and demographic characteristics. We define exposure density (Exρ) as a measure of both the localized volume of activity in a defined area and the proportion of activity occurring in distinct land-use types. Using detailed neighborhood data for New York City, we quantify neighborhood exposure density using anonymized smartphone geolocation data over a 3-month period covering more than 12 million unique devices and rasterize granular land-use information to contextualize observed activity. Next, we analyze disparities in community social distancing by estimating variations in neighborhood activity by land-use type before and after a mandated stay-at-home order. Finally, we evaluate the effects of localized demographic, socioeconomic, and built-environment density characteristics on infection rates and deaths in order to identify disparities in health outcomes related to exposure risk. Our findings demonstrate distinct behavioral patterns across neighborhoods after the stay-at-home order and that these variations in exposure density had a direct and measurable impact on the risk of infection. Notably, we find that an additional 10% reduction in exposure density city-wide could have saved between 1,849 and 4,068 lives during the study period, predominantly in lower-income and minority communities.
  5. This article develops and assesses the concept of triple neighborhood disadvantage. We argue that a neighborhood’s well-being depends not only on its own socioeconomic conditions but also on the conditions of neighborhoods its residents visit and are visited by, connections that form through networks of everyday urban mobility. We construct measures of mobility-based disadvantage using geocoded patterns of movement estimated from hundreds of millions of tweets sent by nearly 400,000 Twitter users over 18 months. Analyzing nearly 32,000 neighborhoods and 9,700 homicides in 37 of the largest U.S. cities, we show that neighborhood triple disadvantage independently predicts homicides, adjusting for traditional neighborhood correlates of violence, spatial proximity to disadvantage, prior homicides, and city fixed effects. Not only is triple disadvantage a stronger predictor than traditional measures, it accounts for a sizable portion of the association between residential neighborhood disadvantage and homicides. In turn, potential mechanisms such as neighborhood drug activity, interpersonal friction, and gun crime prevalence account for much of the association between triple disadvantage and homicides. These findings implicate structural mobility patterns as an important source of triple (dis)advantage for neighborhoods and have implications for a broad range of phenomena beyond crime, including community capacity, gentrification, transmission in a pandemic, and racial inequality.
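
The idea in item 5, that a neighborhood is characterized by its own conditions, by the conditions of the neighborhoods its residents visit, and by the conditions of the neighborhoods its visitors come from, can be sketched as three numbers per neighborhood. The abstract does not give the article's exact construction, so the visit-weighted averages below, the function name `triple_disadvantage`, and the toy scores and flows are all assumptions made for illustration.

```python
def triple_disadvantage(own, flows):
    """Return own, visited, and visitor disadvantage for each neighborhood.

    own:   {neighborhood: disadvantage score}
    flows: {(origin, destination): number of visits by origin residents}
    The visit-weighted mean is an assumed, illustrative aggregation.
    """
    result = {}
    for n in own:
        outbound = [(d, w) for (o, d), w in flows.items() if o == n and d != n]
        inbound = [(o, w) for (o, d), w in flows.items() if d == n and o != n]
        out_total = sum(w for _, w in outbound)
        in_total = sum(w for _, w in inbound)
        result[n] = {
            "own": own[n],
            # Mean disadvantage of places this neighborhood's residents visit.
            "visited": (
                sum(w * own[d] for d, w in outbound) / out_total
                if out_total else None
            ),
            # Mean disadvantage of the home neighborhoods of its visitors.
            "visitors": (
                sum(w * own[o] for o, w in inbound) / in_total
                if in_total else None
            ),
        }
    return result


# Hypothetical disadvantage scores and visit counts.
own = {"A": 0.8, "B": 0.2, "C": 0.5}
flows = {("A", "B"): 30, ("A", "C"): 10, ("B", "A"): 5, ("C", "A"): 15}
scores = triple_disadvantage(own, flows)
```

Here neighborhood A has a high score of its own, but its residents mostly visit the less disadvantaged B, so its "visited" value is much lower than its "own" value. A neighborhood with no recorded outbound or inbound visits gets `None` for the corresponding component.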