Urban dispersal events occur when an unexpectedly large number of people leave an area in a relatively short period of time. It is beneficial for the city authorities, such as law enforcement and city management, to have an advance knowledge of such events, as it can help them mitigate the safety risks and handle important challenges such as managing traffic, and so forth. Predicting dispersal events is also beneficial to Taxi drivers and/or ride-sharing services, as it will help them respond to an unexpected demand and gain competitive advantage. Large urban datasets such as detailed trip records and point of interest ( POI ) data make such predictions achievable. The related literature mainly focused on taxi demand prediction. The pattern of the demand was assumed to be repetitive and proposed methods aimed at capturing those patterns. However, dispersal events are, by definition, violations of those patterns and are, understandably, missed by the methods in the literature. We proposed a different approach in our prior work [32]. We showed that dispersal events can be predicted by learning the complex patterns of arrival and other features that precede them in time. We proposed a survival analysis formulation of this problem and proposed a two-stage framework (DILSA), where a deep learning model predicted the survival function at each point in time in the future. We used that prediction to determine the time of the dispersal event in the future, or its non-occurrence. However, DILSA is subject to a few limitations. First, based on evidence from the data, mobility patterns can vary through time at a given location. DILSA does not distinguish between different mobility patterns through time. Second, mobility patterns are also different for different locations. DILSA does not have the capability to directly distinguish between different locations based on their mobility patterns. In this article, we address these limitations by proposing a method to capture the interaction between POIs and mobility patterns and we create vector representations of locations based on their mobility patterns. We call our new method DILSA+. We conduct extensive case studies and experiments on the NYC Yellow taxi dataset from 2014 to 2016. Results show that DILSA+ can predict events in the next 5 hours with an F1-score of 0.66. It is significantly better than DILSA and the state-of-the-art deep learning approaches for taxi demand prediction. 
                        more » 
                        « less   
                    
                            
                            Predicting Urban Dispersal Events: A Two-Stage Framework through Deep Survival Analysis on Mobility Data
                        
                    
    
            Urban dispersal events are processes where an unusually large number of people leave the same area in a short period. Early prediction of dispersal events is important in mitigating congestion and safety risks and making better dispatching decisions for taxi and ride-sharing fleets. Existing work mostly focuses on predicting taxi demand in the near future by learning patterns from historical data. However, they fail in case of abnormality because dispersal events with abnormally high demand are non-repetitive and violate common assumptions such as smoothness in demand change over time. Instead, in this paper we argue that dispersal events follow a complex pattern of trips and other related features in the past, which can be used to predict such events. Therefore, we formulate the dispersal event prediction problem as a survival analysis problem. We propose a two-stage framework (DILSA), where a deep learning model combined with survival analysis is developed to predict the probability of a dispersal event and its demand volume. We conduct extensive case studies and experiments on the NYC Yellow taxi dataset from 2014-2016. Results show that DILSA can predict events in the next 5 hours with F1-score of 0:7 and with average time error of 18 minutes. It is orders of magnitude better than the state-of-the-art deep learning approaches for taxi demand prediction. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 1657350
- PAR ID:
- 10098324
- Date Published:
- Journal Name:
- the Thirty-Third AAAI Conference on Artificial Intelligence
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Utilizing large-scale urban data sets to predict taxi and Uber passengers demand in cities is valuable for designing better taxi dispatch system and improving taxi services. In this paper, we predict taxi and Uber demand using two real-world data sets. Our approach consists of two key steps. First, we use temporal-correlated entropy to measure the demand regularity and obtain the maximum predictability. Second, we implement and assess five well-known representative predictors (Markov, LZW, ARIMA, MLP and LSTM) in achieving the maximum predictability. The results show that, on average, the maximum predictability can be as high as 83%, indicating a high temporal regularity of taxi demand in cities. In areas with low maximum predictability ( Πmax<0.83 ), the deep learning predictor LSTM can achieve high prediction accuracy by capturing hidden long-term temporal dependency. In areas with high maximum predictability ( Πmax⩾0.83 ), the Markov predictor can infer taxi demand with 86% accuracy, 14% better than LSTM, while requiring only 0.02% computation time. These findings suggest that the maximum predictability can help determine which predictor to use in terms of the accuracy and computational costs.more » « less
- 
            Abstract Event logs, comprising data on the occurrence of different types of events and associated times, are commonly collected during the operation of modern industrial machines and systems. It is widely believed that the rich information embedded in event logs can be used to predict the occurrence of critical events. In this paper, we propose a recurrent neural network model using time‐to‐event data from event logs not only to predict the time of the occurrence of a target event of interest, but also to interpret, from the trained model, significant events leading to the target event. To improve the performance of our model, sampling techniques and methods dealing with the censored data are utilized. The proposed model is tested on both simulated data and real‐world datasets. Through these comparison studies, we show that the deep learning approach can often achieve better prediction performance than the traditional statistical model, such as, the Cox proportional hazard model. The real‐world case study also shows that the model interpretation algorithm proposed in this work can reveal the underlying physical relationship among events.more » « less
- 
            A point-to-point process describes a dynamic network where a set of edge events are observed, each of which is associated with a time of occurrence and two vertices lying in their state spaces. This study intends to investigate one application of such processes, using NYC Taxi and Limousine Commission dataset that reports taxi trips between two locations at a certain time. Here a point-to-point process is formed with edge events being taxi trips and the vertices adjacent to the edge events are pick-up and drop-off locations, described by latitude and longitude pairs. The intensity of an edge event can have a temporal dependence in addition to being dependent on a latent, spatially-coherent community structure for the vertices. To this end, we have developed a methodology that estimates a spatially smoothed community structure and localizes temporal change-points for point-to-point processes. By applying this to our dataset, we can explore the spatio-temporal dynamics of the demand of taxi trips. More specifically, with reasonable assumptions, the latent community structure is estimated by spectral partitioning based on a low-rank reconstruction of aggregated taxi-trip network; and the temporal change-point localization can be carried out by solving a matrix group fused LASSO program.more » « less
- 
            Predicting the occurrence of a particular event of interest at future time points is the primary goal of survival analysis. The presence of incomplete observations due to time limitations or loss of data traces is known as censoring which brings unique challenges in this domain and differentiates survival analysis from other standard regression methods. The popularly used survival analysis methods such as Cox proportional hazard model and parametric survival regression suffer from some strict assumptions and hypotheses that are not realistic in most of the real-world applications. To overcome the weaknesses of these two types of methods, in this paper, we reformulate the survival analysis problem as a multi-task learning problem and propose a new multi-task learning based formulation to predict the survival time by estimating the survival status at each time interval during the study duration. We propose an indicator matrix to enable the multi-task learning algorithm to handle censored instances and incorporate some of the important characteristics of survival problems such as non-negative non-increasing list structure into our model through max-heap projection. We employ the L2,1-norm penalty which enables the model to learn a shared representation across related tasks and hence select important features and alleviate over-fitting in high-dimensional feature spaces; thus, reducing the prediction error of each task. To efficiently handle the two non-smooth constraints, in this paper, we propose an optimization method which employs Alternating Direction Method of Multipliers (ADMM) algorithm to solve the proposed multi-task learning problem. We demonstrate the performance of the proposed method using real-world microarray gene expression high-dimensional benchmark datasets and show that our method outperforms state-of-the-art methods.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    