Title: Spectral Clustering in Railway Crossing Accidents Analysis
Abstract This study employs graph mining and spectral clustering to analyze patterns in railway crossing accidents, utilizing a comprehensive dataset from the US Department of Transportation. By constructing a graph of implicit relationships between railway companies based on shared accident localities, we apply spectral clustering to identify distinct clusters of companies with similar accident patterns. This offers nuanced insight into the underlying structure of these incidents. Our results indicate that “Highway User Position” and “Equipment Involved” play pivotal roles in accident clustering, while temporal elements like “Date” and “Time” exert a diminished impact. This research not only sheds light on potential accident causation factors but also sets the stage for subsequent predictive safety analyses. It aims to serve as a cornerstone for future studies that aspire to leverage advanced data-driven techniques for improving railway crossing safety protocols.
Award ID(s):
2112650
PAR ID:
10591479
Publisher / Repository:
American Society of Mechanical Engineers
ISBN:
978-0-7918-8777-6
Format(s):
Medium: X
Location:
Columbia, South Carolina, USA
Sponsoring Org:
National Science Foundation
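The abstract above describes linking railway companies through shared accident localities and then clustering that graph spectrally. A minimal sketch of the idea follows, under loud assumptions: the company names, the localities, and the edge weight (a plain count of shared localities) are all hypothetical, and the two-way split on the sign of the Fiedler vector is the simplest form of spectral clustering, not necessarily the procedure used in the paper.

```python
import numpy as np

# Hypothetical toy input: company -> localities where it reported crossing accidents.
ACCIDENT_LOCALITIES = {
    "Company A": {"county_1", "county_2", "county_3", "county_4"},
    "Company B": {"county_2", "county_3"},
    "Company C": {"county_1", "county_3"},
    "Company D": {"county_7", "county_8"},
    "Company E": {"county_8", "county_9"},
    "Company F": {"county_4", "county_7", "county_9"},
}


def shared_locality_graph(localities):
    """Return sorted company names and a symmetric matrix of shared-locality counts."""
    names = sorted(localities)
    weights = np.zeros((len(names), len(names)))
    for i, first in enumerate(names):
        for j in range(i + 1, len(names)):
            shared = len(localities[first] & localities[names[j]])
            weights[i, j] = weights[j, i] = shared
    return names, weights


def spectral_bipartition(weights):
    """Split a connected weighted graph in two using the normalized Laplacian."""
    degrees = weights.sum(axis=1)
    if np.any(degrees == 0):
        raise ValueError("every company needs at least one shared locality")
    inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalized Laplacian: I - D^(-1/2) W D^(-1/2).
    laplacian = np.eye(len(weights)) - inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector.
    _, eigenvectors = np.linalg.eigh(laplacian)
    fiedler = inv_sqrt * eigenvectors[:, 1]
    # The sign of each entry assigns the node to one of two clusters.
    return (fiedler > 0).astype(int)


names, weights = shared_locality_graph(ACCIDENT_LOCALITIES)
labels = spectral_bipartition(weights)
for name, label in zip(names, labels):
    print(name, "-> cluster", int(label))
```

Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. Splitting into more than two clusters would typically run k-means on several leading eigenvectors instead of thresholding one.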
More Like this
  1. Abstract Expanding on the insights from our initial investigation into railway accident patterns, this paper delves deeper into the predictive capabilities of machine learning to forecast potential accident trends in railway crossings. Focusing on critical factors such as “Highway User Position” and “Equipment Involved,” we integrate Kernel Ridge Regression (KRR) models tailored to distinct clusters, as well as a global model for the entire dataset. These models, trained on historical data, discern patterns and correlations that might elude traditional statistical methods. Our findings are compelling: certain clusters, despite limited data points, showcase remarkably low Root Mean Squared Error (RMSE) values between predictions and real data, indicating superior model performance. However, certain clusters hint at potential overfitting, given the disparities between model predictions and actual data. Conversely, clusters with vast datasets underperform compared to the global model, suggesting intricate interactions within the data that might challenge the model’s capabilities. The performance nuances across clusters emphasize the value of specialized, cluster-specific models in capturing the intricacies of each dataset segment. This study underscores the efficacy of KRR in predicting future railway crossing incidents, fostering the implementation of data-driven strategies in public safety.
  2. Sendiña-Nadal, Irene (Ed.)
    Local graph clustering is an important machine learning task that aims to find a well-connected cluster near a set of seed nodes. Recent results have revealed that incorporating higher order information significantly enhances the results of graph clustering techniques. The majority of existing research in this area focuses on spectral graph theory-based techniques. However, an alternative perspective on local graph clustering arises from using max-flow and min-cut on the objectives, which offer distinctly different guarantees. For instance, a new method called capacity releasing diffusion (CRD) was recently proposed and shown to preserve local structure around the seeds better than spectral methods. The method was also the first local clustering technique that is not subject to the quadratic Cheeger inequality by assuming a good cluster near the seed nodes. In this paper, we propose a local hypergraph clustering technique called hypergraph CRD (HG-CRD) by extending the CRD process to cluster based on higher order patterns, encoded as hyperedges of a hypergraph. Moreover, we theoretically show that HG-CRD gives results about a quantity called motif conductance, rather than a biased version used in previous experiments. Experimental results on synthetic datasets and real world graphs show that HG-CRD enhances the clustering quality. 
  3. ABSTRACT The accepted gap—the time or distance a driver deems sufficient to enter or cross an intersection—is a key indicator of traffic risk, particularly at uncontrolled three‐legged intersections. Smaller accepted gaps are linked to higher risk due to an increased chance of vehicle conflicts. This study investigates the relationship between accepted gaps and risk and proposes a method to quantify the level of risk and severity (LORS) to guide targeted safety interventions. Data on vehicle speed, accepted gap and critical gap were collected from six rural intersections in India. Using a binary logit regression model and clustering techniques, the LORS was estimated and validated against actual accident data, yielding a predictive accuracy of up to 83%. The significance of this study lies in its novel data‐driven approach to safety assessment using parameters easily measured in the field. Designed for heterogeneous traffic conditions, the method provides traffic engineers and planners with a practical tool to assess intersection safety, recommend specific remedial measures and prioritise interventions based on risk and severity levels. With potential for automation and scalability, this research contributes to the development of safer road systems, particularly in low‐resource settings where conventional crash data is limited or unavailable. 
  4. Abstract In recent years, longer and heavier trains have become more common, primarily driven by efficiency and cost‐saving measures in the railroad industry. Regulation of train length is currently under consideration in the United States at both the federal and state levels, because of concerns that longer trains may have a higher risk of derailment, but the relationship between train length and risk of derailment is not yet well understood. In this study, we use data on freight train accidents during the 2013–2022 period from the Federal Railroad Administration (FRA) Rail Equipment Accident and Highway‐Rail Grade Crossing Accident databases to estimate the relationship between freight train length and the risk of derailment. We determine that longer trains do have a greater risk of derailment. Based on our analysis, running 100‐car trains is associated with 1.11 (95% confidence interval: 1.10–1.12) times the derailment odds of running 50‐car trains (or an 11% increase), even accounting for the fact that only half as many 100‐car trains would need to run. For 200‐car trains, the odds increase by 24% (odds ratio 1.24, 95% confidence interval: 1.20–1.28), again accounting for the need for fewer trains. Understanding derailment risk is an important component for evaluating the overall safety of the rail system and for the future development and regulation of freight rail transportation. Given the limitations of the current data on freight train length, this study provides an important step toward such an understanding.
  5. Abstract From 2013 to 2022, 1671 derailments were reported by the Federal Railroad Administration (FRA), 8.2% of which were due to journal bearing defects. The University Transportation Center for Railway Safety (UTCRS) designed an onboard monitoring system that tracks vibration waveforms over time to assess bearing health through three analysis levels. However, the speed of the bearing, a fundamental parameter for these analyses, is often acquired from Global Positioning System (GPS) data, which is typically not available at the sensor location. To solve this issue, this paper proposes to employ Machine Learning (ML) algorithms to extract the speed and other essential features from existing vibration data, eliminating the need for additional speed sensors. Specifically, the proposed method tries to extract the speed information from the signatures that are embedded in the Power Spectral Density (PSD) plot, which enables rapid real-time analysis of bearings while the train is in motion. The extracted data could then be sent to a cloud accessible by train dispatchers and railcar owners for assessment of bearings and scheduling of replacements before defects reach a dangerous size. Eventually, the developed algorithm will reduce derailments and unplanned field replacements and afford rail stakeholders more cost-effective preventive maintenance.
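Item 1 in the list above compares Kernel Ridge Regression (KRR) models fitted per cluster against one global model, judged by RMSE. The sketch below illustrates that comparison only in outline. Everything in it is an assumption made for illustration: the data are synthetic, the RBF kernel and the `ridge` and `gamma` values are arbitrary choices, and the helper names (`krr_fit`, `krr_predict`, `compare_models`) are hypothetical rather than taken from the paper.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def krr_fit(x, y, ridge=1e-2, gamma=1.0):
    """Closed-form KRR: solve (K + ridge * I) alpha = y."""
    kernel = rbf_kernel(x, x, gamma)
    alpha = np.linalg.solve(kernel + ridge * np.eye(len(x)), y)
    return {"x": x, "alpha": alpha, "gamma": gamma}


def krr_predict(model, x_new):
    """Predict as a kernel-weighted sum over the training points."""
    return rbf_kernel(x_new, model["x"], model["gamma"]) @ model["alpha"]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))


def compare_models(seed=0, n_per_cluster=40):
    """Score cluster-specific models against a global model on held-out points."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=(2 * n_per_cluster, 1))
    cluster = np.repeat([0, 1], n_per_cluster)
    # Synthetic targets: each cluster follows a different trend, plus noise.
    trend = np.where(cluster == 0, np.sin(2.0 * x[:, 0]), 0.5 * x[:, 0])
    y = trend + rng.normal(0.0, 0.1, len(x))
    held_out = np.arange(len(x)) % 4 == 0  # every fourth point is test data

    # The global model sees all training points but not the cluster label.
    global_model = krr_fit(x[~held_out], y[~held_out])
    scores = {}
    for c in (0, 1):
        train_c = (cluster == c) & ~held_out
        test_c = (cluster == c) & held_out
        local_model = krr_fit(x[train_c], y[train_c])
        scores[c] = {
            "cluster_rmse": rmse(y[test_c], krr_predict(local_model, x[test_c])),
            "global_rmse": rmse(y[test_c], krr_predict(global_model, x[test_c])),
        }
    return scores


for cluster_id, result in compare_models().items():
    print(cluster_id, result)
```

The numbers printed for this toy data say nothing about the results reported in the abstract. The sketch only shows how held-out RMSE can be computed per cluster for both kinds of model so that the two can be compared side by side.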