Effective public transit operations are one of the fundamental requirements for a modern community. Recently, a number of transit agencies have started integrating automated vehicle locators into their fleets, which provide real-time estimates of the time of arrival. In this paper, we use data collected over several months from one such transit system and show how this data can potentially be used to learn long-term patterns of travel time. More specifically, we study the effect of weather and other factors, such as traffic, on transit system delay. These models can later be used to understand seasonal variations and to design adaptive and transient transit schedules. Towards this goal, we also propose an online architecture called DelayRadar. The novelty of DelayRadar lies in three aspects: (1) a data store that collects and integrates real-time and static data from multiple data sources, (2) a predictive statistical model that analyzes the data to make predictions on transit travel time, and (3) a decision-making framework to develop an optimal transit schedule based on variable forecasts related to traffic, weather, and other impactful factors. This paper focuses on identifying the model with the best predictive accuracy to be used in DelayRadar. According to the preliminary study results, we are able to explain more than 70% of the variance in bus travel time and, using information on the bus schedule, traffic, and weather, to predict future travel times with an out-of-sample error of 4.8 minutes.
Transit-hub: a smart public transportation decision support system with multi-timescale analytical services
Public transit is a critical component of a smart and connected community. As such, citizens expect and require accurate information about real-time arrivals and departures of transportation assets. As transit agencies enable large-scale integration of real-time sensors and support back-end data-driven decision support systems, the dynamic data-driven applications systems (DDDAS) paradigm becomes a promising approach to making the system smarter: it provides online model learning and multi-timescale analytics as part of the decision support system used in the DDDAS feedback loop. In this paper, we describe a system in use in Nashville and illustrate the analytic methods developed by our team. These methods use both historical and real-time streaming data for online bus arrival prediction. The historical data is used to build classifiers that enable us to create expected performance models and to identify anomalies. These classifiers can be used to provide schedule adjustment feedback to the metro transit authority. We also show how these analytics services can be packaged into modular, distributed, and resilient micro-services that can be deployed on both cloud back ends and edge computing resources.
- PAR ID: 10054141
- Date Published:
- Journal Name: Cluster Computing
- ISSN: 1386-7857
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Real-time decision making has attracted increasing interest as a means of efficiently operating complex systems. The main challenge in achieving real-time decision making is to understand how to develop next-generation optimization procedures that can work efficiently using (i) real data coming from a large, complex dynamical system and (ii) available simulation models that reproduce the system dynamics. Although this paper addresses a different problem from the one studied in the reinforcement learning (RL) literature, the proposed methods can also be used as support in a sequential setting. The result of this work is the new Generalized Ordinal Learning Framework (GOLF), which treats simulated data as low-accuracy information to be collected intelligently offline and used online once the scenario is revealed to the user. GOLF supports real-time decision making on complex dynamical systems once a specific scenario is realized. Preliminary results for the proposed techniques motivate further pursuit of the presented ideas.
- Online traffic classification enables critical applications such as network intrusion detection and prevention, Quality-of-Service provisioning, and real-time IoT analytics. However, with increasing network speeds, it has become extremely challenging to analyze and classify traffic online. In this paper, we present Leo, a system for online traffic classification at multi-terabit line rates. At its core, Leo implements an online machine learning (ML) model for traffic classification, namely the decision tree, in the network switch's data plane. Leo's design is fast (it can classify packets at the switch's line rate), scalable (it can automatically select a resource-efficient design for the class of decision tree models a user wants to support), and runtime programmable (the model can be updated on the fly without switch downtime), while achieving high model accuracy. We implement Leo on top of Intel Tofino switches. Our evaluations show that Leo is able to classify traffic at line rate with nominal latency overhead and can scale to model sizes more than twice as large as those of state-of-the-art data plane ML classification systems, while achieving classification accuracy on par with an offline traffic classifier.
- In most process control systems today, process measurements are periodically collected and archived in historians. Analytics applications process the data and provide results offline or on a time scale that is considerably slow in comparison to the performance of many manufacturing processes. With the proliferation of the Internet of Things (IoT) and the introduction of "pervasive sensors" technology in process industries, an increasing number of sensors and actuators are installed in process plants for pervasive sensing and control, and the volume of produced process data is growing exponentially. To digest these data and meet the ever-growing requirements to increase production efficiency and improve product quality, there needs to be a way both to improve the performance of the analytic system and to scale the system to closely monitor a much larger set of plant resources. In this paper, we present a real-time data analytics platform, referred to as RT-DAP, to support large-scale continuous data analytics in process industries. RT-DAP is designed to stream, store, process, and visualize a large volume of real-time data flows collected from heterogeneous plant resources, and to feed results back to the control system and operators in real time. A prototype of the platform is implemented on Microsoft Azure. Our extensive experiments validate the design methodologies of RT-DAP and demonstrate its efficiency at both the component and system levels.
- Public-transit systems face a number of operational challenges: (a) changing ridership patterns that require optimization of fixed line services, (b) optimizing vehicle-to-trip assignments to reduce maintenance and operation costs, and (c) ensuring equitable and fair coverage of areas with low ridership. Optimizing these objectives presents a hard computational problem due to the size and complexity of the decision space. State-of-the-art methods formulate these problems as variants of the vehicle routing problem and use data-driven heuristics to optimize the procedures. However, the evaluation and training of these algorithms require large datasets that provide realistic coverage of various operational uncertainties. This paper presents a dynamic simulation platform, called Transit-Gym, that can bridge this gap by providing the ability to simulate scenarios, focusing on variations of demand models, route networks, and vehicle-to-trip assignments. The central contribution of this work is a domain-specific language and an associated experimentation tool-chain and infrastructure that enable subject-matter experts to intuitively specify, simulate, and analyze large-scale transit scenarios and their parametric variations. Of particular significance is an integrated microscopic energy consumption model that also helps to analyze the energy cost of various transit decisions made by a city's transportation agency.

