This content will become publicly available on September 2, 2026

Title: Equivalence and Clustering in Worker Flows: Stochastic Blockmodels for the Analysis of Mobility Tables
Mobility scholars are increasingly turning to computational methods to analyze mobility tables. Most of these approaches start with the detection of mobility clusters, namely, sets of occupations within which the flow of workers is dense and across which the flow is sparse. Yet clustering is not the only way worker flows can be structured. This article shows how a degree-corrected stochastic blockmodel can detect patterns of mobility that are more general than clustering and consistent with the homogeneity criterion laid out by Goodman as well as the internal homogeneity thesis proposed by Breiger. Because of the intractable marginal likelihood of the model, parameters are estimated using a variational expectation maximization algorithm. Simulation results suggest the estimation algorithm successfully recovers (conditionally) stochastically equivalent mobility classes. Analysis of two real-world examples shows the model is able to detect meaningful mobility patterns, even in situations in which commonly used community detection algorithms fail.
Award ID(s):
2116938 2335815 2116936
PAR ID:
10655858
Author(s) / Creator(s):
 
Publisher / Repository:
Sage Journals
Date Published:
Journal Name:
Sociological Methodology
ISSN:
0081-1750
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mining spatiotemporal mobility patterns is crucial for optimizing urban planning, enhancing transportation systems, and improving public safety by providing useful insights into human movement and behavior over space and time. As an unsupervised learning technique, time series clustering has gained considerable attention due to its efficiency. However, the existing literature has often overlooked the inherent characteristics of mobility data, including high-dimensionality, noise, outliers, and time distortions. This oversight can lead to potentially large computational costs and inaccurate patterns. To address these challenges, this paper proposes a novel neural network-based method integrating temporal autoencoder and dynamic time warping-based K-means clustering algorithm to mutually promote each other for mining spatiotemporal mobility patterns. Comparative results showed that our proposed method outperformed several time series clustering techniques in accurately identifying mobility patterns on both synthetic and real-world data, which provides a reliable foundation for data-driven decision-making. Furthermore, we applied the method to monthly county-level mobility data during the COVID-19 pandemic in the U.S., revealing significant differences in mobility changes between rural and urban areas, as well as the impact of public response and health considerations on mobility patterns. 
  2. Modern smart cities are focusing on smart transportation solutions to detect and mitigate the effects of various traffic incidents in the city. To materialize this, roadside units and ambient transportation sensors are being deployed to collect vehicular data that provides real-time traffic monitoring. In this paper, we first propose a real-time data-driven anomaly-based traffic incident detection framework for a city-scale smart transportation system. Specifically, we propose an incremental region growing approximation algorithm for optimal Spatio-temporal clustering of road segments and their data; such that road segments are strategically divided into highly correlated clusters. The highly correlated clusters enable identifying a Pythagorean Mean-based invariant as an anomaly detection metric that is highly stable under no incidents but shows a deviation in the presence of incidents. We learn the bounds of the invariants in a robust manner such that anomaly detection can generalize to unseen events, even when learning from real noisy data. We perform extensive experimental validation using mobility data collected from the City of Nashville, Tennessee, and prove that the method can detect incidents within each cluster in real-time.
  3. Modern smart cities need smart transportation solutions to quickly detect various traffic emergencies and incidents in the city to avoid cascading traffic disruptions. To materialize this, roadside units and ambient transportation sensors are being deployed to collect speed data that enables the monitoring of traffic conditions on each road segment. In this paper, we first propose a scalable data-driven anomaly-based traffic incident detection framework for a city-scale smart transportation system. Specifically, we propose an incremental region growing approximation algorithm for optimal Spatio-temporal clustering of road segments and their data; such that road segments are strategically divided into highly correlated clusters. The highly correlated clusters enable identifying a Pythagorean Mean-based invariant as an anomaly detection metric that is highly stable under no incidents but shows a deviation in the presence of incidents. We learn the bounds of the invariants in a robust manner such that anomaly detection can generalize to unseen events, even when learning from real noisy data. Second, using cluster-level detection, we propose a folded Gaussian classifier to pinpoint the particular segment in a cluster where the incident happened in an automated manner. We perform extensive experimental validation using mobility data collected from four cities in Tennessee, compare with the state-of-the-art ML methods, to prove that our method can detect incidents within each cluster in real-time and outperforms known ML methods. 
  4. In an indoor space, determining a person's mobility patterns has research significance and applicability in real-world scenarios. When mobility patterns are determined, layout optimization can be implemented in indoor spaces to improve efficiency. This research aimed to determine a person's path using Received Signal Strength Indicator (RSSI) data collected from Bluetooth-enabled mobile devices. Mobile app-based mobility detection using Bluetooth RSSI has the advantage of low cost and easy implementation. The research methodology involves developing a Bluetooth RSSI mobility application system to determine the path of a moving mobile device using a vectorized algorithm. The paper presents challenges in creating such a software system, its architecture, the data collection and analysis process, and the results of mobility detection. This research shows that Bluetooth-enabled mobile devices and Bluetooth RSSI data can be used to determine the path in an indoor space with workable accuracy. 
  5. Designing effective algorithms for community detection is an important and challenging problem in large-scale graphs, studied extensively in the literature. Various solutions have been proposed, but many of them are centralized with expensive procedures (requiring full knowledge of the input graph) and have a large running time. In this paper, we present a distributed algorithm for community detection in the stochastic block model (also called planted partition model), a widely-studied and canonical random graph model for community detection and clustering. Our algorithm called CDRW (Community Detection by Random Walks) is based on random walks, and is localized and lightweight, and easy to implement. A novel feature of the algorithm is that it uses the concept of local mixing time to identify the community around a given node. We present a rigorous theoretical analysis that shows that the algorithm can accurately identify the communities in the stochastic block model and characterize the model parameters where the algorithm works. We also present experimental results that validate our theoretical analysis. We also analyze the performance of our distributed algorithm under the CONGEST distributed model as well as the k-machine model, a model for large-scale distributed computations, and show that it can be efficiently implemented.