Title: Predicting Urban Dispersal Events: A Two-Stage Framework through Deep Survival Analysis on Mobility Data
Urban dispersal events are processes in which an unusually large number of people leave the same area within a short period. Early prediction of dispersal events is important for mitigating congestion and safety risks and for making better dispatching decisions for taxi and ride-sharing fleets. Existing work mostly focuses on predicting near-future taxi demand by learning patterns from historical data. However, these methods fail under abnormal conditions, because dispersal events with abnormally high demand are non-repetitive and violate common assumptions such as smooth changes in demand over time. In this paper we argue instead that dispersal events follow a complex pattern of past trips and other related features, which can be used to predict such events. We therefore formulate dispersal event prediction as a survival analysis problem. We propose a two-stage framework (DILSA), in which a deep learning model combined with survival analysis is developed to predict the probability of a dispersal event and its demand volume. We conduct extensive case studies and experiments on the NYC Yellow taxi dataset from 2014-2016. Results show that DILSA can predict events in the next 5 hours with an F1-score of 0.7 and an average time error of 18 minutes. It is orders of magnitude better than the state-of-the-art deep learning approaches for taxi demand prediction.
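To make the survival-analysis framing concrete, here is a minimal sketch of how per-interval event probabilities could be turned into a survival curve and a predicted event time. It is not DILSA's architecture: the discrete-time hazard formulation, the function names `survival_curve` and `predict_event_interval`, and the 0.5 decision threshold are all assumptions chosen for illustration, and the hazards are taken as given rather than produced by a deep model.

```python
from typing import List, Optional


def survival_curve(hazards: List[float]) -> List[float]:
    """Discrete-time survival function S(t) = prod_{k<=t} (1 - h_k).

    hazards[k] is the (assumed) probability that the event happens in
    interval k, given that it has not happened before interval k.
    """
    curve = []
    alive = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        alive *= 1.0 - h
        curve.append(alive)
    return curve


def predict_event_interval(hazards: List[float],
                           threshold: float = 0.5) -> Optional[int]:
    """Return the first interval whose survival probability falls below
    the threshold, or None if no event is predicted within the horizon.

    The threshold rule is an illustrative assumption, not the paper's.
    """
    for t, s in enumerate(survival_curve(hazards)):
        if s < threshold:
            return t
    return None
```

Returning `None` covers the non-occurrence case: if the survival probability never drops below the threshold within the horizon, no event is predicted.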
Award ID(s):
1657350
NSF-PAR ID:
10098324
Journal Name:
the Thirty-Third AAAI Conference on Artificial Intelligence
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Urban dispersal events occur when an unexpectedly large number of people leave an area in a relatively short period of time. It is beneficial for city authorities, such as law enforcement and city management, to have advance knowledge of such events, as it can help them mitigate safety risks and handle challenges such as managing traffic. Predicting dispersal events is also beneficial to taxi drivers and ride-sharing services, as it helps them respond to unexpected demand and gain a competitive advantage. Large urban datasets such as detailed trip records and point of interest (POI) data make such predictions achievable. The related literature has mainly focused on taxi demand prediction, assuming that demand patterns are repetitive and proposing methods that aim to capture those patterns. However, dispersal events are, by definition, violations of those patterns and are, understandably, missed by the methods in the literature. We proposed a different approach in our prior work [32]. We showed that dispersal events can be predicted by learning the complex patterns of arrivals and other features that precede them in time. We formulated the problem as survival analysis and proposed a two-stage framework (DILSA), in which a deep learning model predicts the survival function at each future point in time. We used that prediction to determine the time of the future dispersal event, or its non-occurrence. However, DILSA is subject to a few limitations. First, the data show that mobility patterns at a given location can vary over time, and DILSA does not distinguish between these different patterns. Second, mobility patterns also differ across locations, and DILSA cannot directly distinguish between locations based on their mobility patterns.
In this article, we address these limitations by proposing a method to capture the interaction between POIs and mobility patterns and we create vector representations of locations based on their mobility patterns. We call our new method DILSA+. We conduct extensive case studies and experiments on the NYC Yellow taxi dataset from 2014 to 2016. Results show that DILSA+ can predict events in the next 5 hours with an F1-score of 0.66. It is significantly better than DILSA and the state-of-the-art deep learning approaches for taxi demand prediction. 
  2. Utilizing large-scale urban data sets to predict taxi and Uber passenger demand in cities is valuable for designing better taxi dispatch systems and improving taxi services. In this paper, we predict taxi and Uber demand using two real-world data sets. Our approach consists of two key steps. First, we use temporal-correlated entropy to measure demand regularity and obtain the maximum predictability. Second, we implement and assess five well-known representative predictors (Markov, LZW, ARIMA, MLP and LSTM) in achieving the maximum predictability. The results show that, on average, the maximum predictability can be as high as 83%, indicating a high temporal regularity of taxi demand in cities. In areas with low maximum predictability (Πmax < 0.83), the deep learning predictor LSTM can achieve high prediction accuracy by capturing hidden long-term temporal dependency. In areas with high maximum predictability (Πmax ⩾ 0.83), the Markov predictor can infer taxi demand with 86% accuracy, 14% better than LSTM, while requiring only 0.02% of the computation time. These findings suggest that the maximum predictability can help determine which predictor to use in terms of accuracy and computational cost.
  3. Event logs, comprising data on the occurrence of different types of events and their associated times, are commonly collected during the operation of modern industrial machines and systems. It is widely believed that the rich information embedded in event logs can be used to predict the occurrence of critical events. In this paper, we propose a recurrent neural network model that uses time-to-event data from event logs not only to predict the time of occurrence of a target event of interest, but also to interpret, from the trained model, the significant events leading to the target event. To improve the performance of our model, sampling techniques and methods for dealing with censored data are utilized. The proposed model is tested on both simulated data and real-world datasets. Through these comparison studies, we show that the deep learning approach can often achieve better prediction performance than traditional statistical models such as the Cox proportional hazards model. The real-world case study also shows that the model interpretation algorithm proposed in this work can reveal the underlying physical relationships among events.
  4. Crowdfunding has gained widespread attention in recent years. Despite the huge success of crowdfunding platforms, only around 40% of projects succeed in reaching their goal amount. Moreover, many of these platforms follow an "all-or-nothing" policy, which means the pledged amount is collected only if the goal is reached within a predefined time duration. Hence, estimating the probability of success for a project is one of the most important research challenges in the crowdfunding domain. Predicting project success calls for new models that can combine the power of both classification (which incorporates both successful and failed projects) and regression (for estimating the time to success). In this paper, we formulate project success prediction as a survival analysis problem and apply the censored regression approach, in which regression can be performed in the presence of partial information. We rigorously study the project success time distribution of crowdfunding data and show that the logistic and log-logistic distributions are a natural choice for learning from such data. We investigate various censored regression models using comprehensive data of 18K Kickstarter (a popular crowdfunding platform) projects and 116K corresponding tweets collected from Twitter. We show that models that take full advantage of both the successful and failed projects during the training phase perform significantly better at predicting the success of future projects than those that use only the successful projects. We provide a rigorous evaluation on many sets of relevant features and show that adding a few temporal features obtained at the project's early stages can dramatically improve performance.
  5. Searching for parking is a problem faced by many drivers, especially in urban areas. With increasing public demand for parking information and services, as well as the proliferation of advanced smartphones, a range of smartphone-based parking management services has begun to emerge. Funded by the National Science Foundation, our research aims to explore the potential of smartphone-based parking management services as a solution to parking problems, to deepen our understanding of travelers’ parking behaviors, and to further advance the analytical foundations and methodologies for modeling and assessing parking solutions. This paper summarizes progress and results from our research projects on smartphone-based parking management, including parking availability prediction, parking search strategy, the development of a mobile parking application, and our next steps to learn and discover new knowledge from its deployment. To predict future parking occupancy, we proposed a practical framework that integrates machine-learning techniques with a model-based core approach that explicitly models the stochastic parking process. The framework is able to predict future parking occupancy from historical occupancy data alone, and can handle complex arrival and departure patterns in real-world case studies, including special events. With the predicted probabilistic availability information, a cost-minimizing parking search strategy is developed. The parking search problem for an individual user is a stochastic Markov decision process and is formalized as a dynamic programming problem. The cost-minimizing parking search strategy is solved by value iteration. Our simulated experiments showed that the cost-minimizing strategy has the lowest expected cost but tends to direct a user to visit more parking facilities than two greedy strategies do.
Currently, we are working on implementing the predictive framework and the searching algorithm in a mobile phone application. We are working closely with Arizona State University (ASU) Parking and Transit Services to implement a three-stage pilot deployment of the prototype application around the ASU main campus. In the first stage, our application will provide real-time information and we will incorporate availability prediction and searching guidance in the second and third stages. Once the mobile application is deployed, it will provide unique opportunities to collect data on parking search behaviors, discover emerging scenarios of smartphone-based parking management services, and assess the impacts of such systems. 
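The value-iteration step described in item 5 can be illustrated with a small sketch. The model below is a deliberately simplified assumption, not the authors' formulation: each facility `i` has a predicted probability `p_free[i]` of having a space, availability is treated as independent on every visit, and a driver who finds no space moves to whichever other facility minimizes travel cost plus expected cost-to-go. All names (`parking_value_iteration`, `p_free`, `park_cost`, `travel`) are hypothetical.

```python
from typing import List, Tuple


def parking_value_iteration(
    p_free: List[float],        # assumed P(space available) at each facility
    park_cost: List[float],     # cost incurred when parking at each facility
    travel: List[List[float]],  # travel[i][j]: cost of driving from i to j
    tol: float = 1e-9,
    max_iter: int = 10_000,
) -> Tuple[List[float], List[int]]:
    """Expected cost-to-go on arriving at each facility, plus the best
    next facility to try if the current one turns out to be full.

    Bellman update (illustrative model):
        V(i) = p_i * park_cost_i + (1 - p_i) * min_{j != i} (travel[i][j] + V(j))
    """
    n = len(p_free)
    if n < 2:
        raise ValueError("need at least two facilities")
    values = [0.0] * n
    for _ in range(max_iter):
        new_values = []
        for i in range(n):
            move_on = min(travel[i][j] + values[j] for j in range(n) if j != i)
            new_values.append(p_free[i] * park_cost[i] + (1.0 - p_free[i]) * move_on)
        delta = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        policy.append(min(others, key=lambda j: travel[i][j] + values[j]))
    return values, policy
```

In this sketch the iteration contracts whenever every facility has a nonzero chance of being free; a more faithful model would also track time-varying availability and which facilities have already been visited.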