Title: Spatiotemporal prediction of foot traffic
Foot traffic is a business term for the number of customers who enter a point of interest (POI). This work aims to predict future foot traffic: the number of people from each census block group (CBG) who will visit each POI in a study region, with potential applications in marketing and advertising. Existing techniques for spatiotemporal prediction of foot traffic use location-based social network data, which suffer from sparsity and capture only a handful of visits per day. This study uses highly granular foot traffic data from SafeGraph, a data company that collects mobility data on hundreds of millions of visits per day in the United States alone. Using these data, we explore solutions for predicting weekly foot traffic at the POI level. We propose a collaborative filtering approach using tensor factorization on the (POIs x CBGs x Weeks) data tensor. This approach provides a de-noised estimate of visits in previous weeks for all POI-CBG pairs. Using this de-noised tensor, we explore various time series prediction models: weekly rolling average, weighted weekly rolling average, univariate linear regression, polynomial regression, and long short-term memory (LSTM) recurrent neural networks. Our results show that, across all the prediction models, the collaborative filtering step consistently improves prediction results. We also found that a simple weighted average consistently performed better than the more sophisticated approaches. Given the abundance of foot traffic data, this result shows that we can improve the spatiotemporal prediction of foot traffic by harnessing collaborative filtering.
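The two-stage idea in the abstract (de-noise the visit tensor, then forecast with a weighted weekly rolling average) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the window length, the linearly increasing weights, and in particular the de-noising step, which swaps in a truncated SVD of the unfolded tensor as a simple stand-in for the tensor factorization the authors describe.

```python
import numpy as np


def low_rank_denoise(visits, rank):
    """Low-rank smoothing of a (POIs x CBGs x Weeks) tensor.

    Assumption: a truncated SVD of the unfolded tensor is used here as a
    stand-in; it is NOT the tensor factorization used in the paper.
    """
    n_poi, n_cbg, n_weeks = visits.shape
    # Unfold so each row is one POI-CBG pair and each column is one week.
    unfolded = visits.reshape(n_poi * n_cbg, n_weeks)
    u, s, vt = np.linalg.svd(unfolded, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Visit counts cannot be negative, so clip the reconstruction at zero.
    return np.clip(approx, 0.0, None).reshape(visits.shape)


def weighted_rolling_forecast(visits, window=4, weights=None):
    """Forecast next week's visits as a weighted mean of the last `window` weeks.

    Assumption: if no weights are given, they grow linearly so that the
    most recent week counts the most.
    """
    if weights is None:
        weights = np.arange(1, window + 1, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    recent = visits[:, :, -window:]  # shape: (POIs, CBGs, window)
    return recent @ weights          # shape: (POIs, CBGs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic counts for 5 POIs, 6 CBGs, and 12 weeks (made-up data).
    visits = rng.poisson(lam=3.0, size=(5, 6, 12)).astype(float)
    smoothed = low_rank_denoise(visits, rank=2)
    forecast = weighted_rolling_forecast(smoothed, window=4)
    print(forecast.shape)
```

The forecast is computed per POI-CBG pair, so summing it over the CBG axis would give a POI-level weekly estimate.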
Award ID(s):
2030685 2109647
PAR ID:
10302634
Journal Name:
LocalRec '21: Proceedings of the 5th ACM SIGSPATIAL International Workshop on Location-based Recommendations, Geosocial Networks and Geoadvertising
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Agent-based models (ABMs) play a prominent role in guiding critical decision-making and supporting the development of effective policies for better urban resilience and response to the COVID-19 pandemic. However, many ABMs lack realistic representations of human mobility, a key process that leads to physical interaction and subsequent spread of disease. Therefore, we propose the application of Latent Dirichlet Allocation (LDA), a topic modeling technique, to foot-traffic data to develop a realistic model of human mobility in an ABM that simulates the spread of COVID-19. In our novel approach, LDA treats POIs as "words" and agent home census block groups (CBGs) as "documents" to extract "topics" of POIs that frequently appear together in CBG visits. These topics allow us to simulate agent mobility based on the LDA topic distribution of their home CBG. We compare the LDA-based mobility model with competitor approaches, including a naive mobility model that assumes visits to POIs are random. We find that the naive mobility model is unable to facilitate the spread of COVID-19 at all. Using the LDA-informed mobility model, we simulate the spread of COVID-19 and test the effect of changes to the number of topics, various parameters, and public health interventions. By examining the simulated number of cases over time, we find that the number of topics does indeed impact disease spread dynamics, but only in terms of the outbreak's timing. Further analysis of simulation results is needed to better understand the impact of topics on simulated COVID-19 spread. This study contributes to strengthening human mobility representations in ABMs of disease spread.
  2. Automated cough detection has significant applications for the surveillance of diseases and supports medical decisions, as cough sounds can be a useful biomarker. However, the implementation and evaluation of robust cough detection models can be challenging due to the lack of real-world data. This paper introduces and makes available a collection of 2,883 coughs and 3,074 non-cough sounds recorded in clinic waiting rooms that we hope will become a baseline for this task. Using this dataset, we evaluate different convolutional network architectures for classifying short audio segments as cough or non-cough. An ensemble model of convolutional neural networks provides the most robust performance and has a ROC AUC of 98.1%. Equally important, we construct a cough counter that incorporates the ensemble model to compute the number of coughs per day. Then, a simple linear model estimates, from the cough counts, the number of visits in which patients report cough symptoms. This simple regression model can predict the number of cough visits in the clinic with a mean absolute error of 4.26 cough visits per day. Using additional information about when patients are in the clinic helps a similar regression model reach a mean absolute error of 3.65 cough visits per day. These results demonstrate the feasibility of using cough detection as a biomarker for the spread of respiratory viruses within the community.
  3. Our goal in this work is to build effective yet robust models to predict unreliable and inconsistent in-kind donations at both weekly and monthly levels for two food banks on opposite coasts: the Food Bank of Central Eastern North Carolina in North Carolina and the Los Angeles Regional Food Bank in California. We explore three factors: model, data length, and window type. For the model, we evaluate a series of classic time-series forecasting models against state-of-the-art approaches such as Bayesian Structural Time Series modeling (BSTS) and deep learning models; for the data length, we vary training data from 2 weeks to 13 years; for the window type, we compare sliding vs. expanding. Our results show that the effectiveness of different models depends heavily on the data length and the window type, as well as on characteristics of the food bank. Motivated by these findings, we investigate the effectiveness of employing an average of all predictions formed by considering all three factors at both monthly and weekly levels for both food banks. Our results show that this average of predictions significantly and consistently outperforms all classical models, deep learning, and BSTS for donation prediction at both monthly and weekly levels for both food banks.
  4. As various smart services are increasingly deployed in modern cities, many unexpected conflicts arise due to various physical world couplings. Existing solutions for conflict resolution often rely on centralized control to enforce predetermined and fixed priorities of different services, which is challenging due to the inconsistent and private objectives of the services. Also, the centralized solutions miss opportunities to resolve conflicts more effectively according to the spatiotemporal locality of the conflicts. To address this issue, we design a decentralized negotiation and conflict resolution framework named DeResolver, which allows services to resolve conflicts by communicating and negotiating with each other to reach a Pareto-optimal agreement autonomously and efficiently. Our design features a two-step self-supervised learning-based algorithm to predict acceptable proposals and their rankings for each opponent through the negotiation. Our design is evaluated with a smart city case study of three services: intelligent traffic light control, pedestrian service, and environmental control. In this case study, a data-driven evaluation is conducted using a large dataset consisting of the GPS locations of 246 surveillance cameras and an automatic traffic monitoring system with more than 3 million records per day to extract real-world vehicle routes. The evaluation results show that our solution achieves much more balanced results, i.e., only increasing the average waiting time of vehicles, the measurement metric of the intelligent traffic light control service, by 6.8% while reducing the weighted sum of air pollutant emission, measured for the environmental control service, by 12.1%, and the pedestrian waiting time, the measurement metric of the pedestrian service, by 33.1%, compared to a priority-based solution.
  5. Non-pharmacologic interventions (NPIs) promote protective actions to lessen exposure risk to COVID-19 by reducing mobility patterns. However, there is a limited understanding of the underlying mechanisms associated with reducing mobility patterns, especially for socially vulnerable populations. The research examines two datasets at a granular scale for five urban locations. Through exploratory analysis of networks, statistics, and spatial clustering, the research extensively investigates the exposure risk reduction after the implementation of NPIs for socially vulnerable populations, specifically lower-income and non-white populations. The mobility dataset tracks population movement across ZIP codes for an origin–destination (O–D) network analysis. The population activity dataset uses the visits from census block groups (CBGs) to points of interest (POIs) for network analysis of population-facilities interactions. The mobility dataset originates from a collaboration with StreetLight Data, a company focusing on transportation analytics, whereas the population activity dataset originates from a collaboration with SafeGraph, a company focusing on POI data. Both datasets indicated that low-income and non-white populations faced higher exposure risk. These findings can help emergency planners and public health officials understand how different populations are able to implement protective actions, and they can inform more equitable and data-driven NPI policies for future epidemics.