skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A Review of Data Analytic Applications in Road Traffic Safety. Part 2: Prescriptive Modeling
In the first part of the review, we observed that there exists a significant gap between the predictive and prescriptive models pertaining to crash risk prediction and minimization, respectively. In this part, we review and categorize the optimization/ prescriptive analytic models that focus on minimizing crash risk. Although the majority of works in this segment of the literature are related to the hazardous materials (hazmat) trucking problems, we show that (with some exceptions) many can also be utilized in non-hazmat scenarios. In an effort to highlight the effect of crash risk prediction model on the accumulated risk obtained from the prescriptive model, we present a simulated example where we utilize four risk indicators (obtained from logistic regression, Poisson regression, XGBoost, and neural network) in the k-shortest path algorithm. From our example, we demonstrate two major designed takeaways: (a) the shortest path may not always result in the lowest crash risk, and (b) a similarity in overall predictive performance may not always translate to similar outcomes from the prescriptive models. Based on the review and example, we highlight several avenues for future research.  more » « less
Award ID(s):
1635927
PAR ID:
10212198
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Sensors
Volume:
20
Issue:
4
ISSN:
1424-8220
Page Range / eLocation ID:
1096
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    This part of the review aims to reduce the start-up burden of data collection and descriptive analytics for statistical modeling and route optimization of risk associated with motor vehicles. From a data-driven bibliometric analysis, we show that the literature is divided into two disparate research streams: (a) predictive or explanatory models that attempt to understand and quantify crash risk based on different driving conditions, and (b) optimization techniques that focus on minimizing crash risk through route/path-selection and rest-break scheduling. Translation of research outcomes between these two streams is limited. To overcome this issue, we present publicly available high-quality data sources (different study designs, outcome variables, and predictor variables) and descriptive analytic techniques (data summarization, visualization, and dimension reduction) that can be used to achieve safer-routing and provide code to facilitate data collection/exploration by practitioners/researchers. Then, we review the statistical and machine learning models used for crash risk modeling. We show that (near) real-time crash risk is rarely considered, which might explain why the optimization models (reviewed in Part 2) have not capitalized on the research outcomes from the first stream. 
    more » « less
  2. Despite ethical and historical arguments for removing race from clinical algorithms, the consequences of removal remain unclear. Here, we highlight a largely undiscussed consideration in this debate: varying data quality of input features across race groups. For example, family history of cancer is an essential predictor in cancer risk prediction algorithms but is less reliably documented for Black participants and may therefore be less predictive of cancer outcomes. Using data from the Southern Community Cohort Study, we assessed whether race adjustments could allow risk prediction models to capture varying data quality by race, focusing on colorectal cancer risk prediction. We analyzed 77,836 adults with no history of colorectal cancer at baseline. The predictive value of self-reported family history was greater for White participants than for Black participants. We compared two cancer risk prediction algorithms—a race-blind algorithm which included standard colorectal cancer risk factors but not race, and a race-adjusted algorithm which additionally included race. Relative to the race-blind algorithm, the race-adjusted algorithm improved predictive performance, as measured by goodness of fit in a likelihood ratio test (P-value: <0.001) and area under the receiving operating characteristic curve among Black participants (P-value: 0.006). Because the race-blind algorithm underpredicted risk for Black participants, the race-adjusted algorithm increased the fraction of Black participants among the predicted high-risk group, potentially increasing access to screening. More broadly, this study shows that race adjustments may be beneficial when the data quality of key predictors in clinical algorithms differs by race group. 
    more » « less
  3. Electric scooters (or e-scooters) are among the most popular micromobility options that have experienced an enormous expansion in urban transportation systems across the world in recent years. Along with the increased usage of e-scooters, the increasing number of e-scooter-related injuries has also become an emerging global public health concern. However, little is known regarding the risk factors for e-scooter-related crashes and injury crashes. This study consisted of a two-phase survey questionnaire administered to a cohort of e-scooter riders (n = 210), which obtained exposure information on riders’ demographics, riding behaviors (including infrastructure selection), helmet use, and other crash-related factors. The risk ratios of riders’ self-reported involvement in an e-scooter-related crash (i.e., any crash versus no crash) and injury crash (i.e., injury crash versus non-injury crash) were estimated across exposure subcategories using the Negative Binomial regression approach. Males and frequent users of e-scooters were associated with an increased risk of e-scooter-related crashes of any type. For the e-scooter-related injury crashes, more frequently riding on bike lanes (i.e., greater than 25% of the time), either protected or unprotected, was identified as a protective factor. E-scooter-related injury crashes were more likely to occur among females, who reported riding on sidewalks and non-paved surfaces more frequently. The study may help inform public policy regarding e-scooter legislation and prioritize efforts to establish suitable road infrastructure for improved e-scooter riding safety. 
    more » « less
  4. IntroductionPredictive models have been used to aid early diagnosis of PCOS, though existing models are based on small sample sizes and limited to fertility clinic populations. We built a predictive model using machine learning algorithms based on an outpatient population at risk for PCOS to predict risk and facilitate earlier diagnosis, particularly among those who meet diagnostic criteria but have not received a diagnosis. MethodsThis is a retrospective cohort study from a SafetyNet hospital’s electronic health records (EHR) from 2003-2016. The study population included 30,601 women aged 18-45 years without concurrent endocrinopathy who had any visit to Boston Medical Center for primary care, obstetrics and gynecology, endocrinology, family medicine, or general internal medicine. Four prediction outcomes were assessed for PCOS. The first outcome was PCOS ICD-9 diagnosis with additional model outcomes of algorithm-defined PCOS. The latter was based on Rotterdam criteria and merging laboratory values, radiographic imaging, and ICD data from the EHR to define irregular menstruation, hyperandrogenism, and polycystic ovarian morphology on ultrasound. ResultsWe developed predictive models using four machine learning methods: logistic regression, supported vector machine, gradient boosted trees, and random forests. Hormone values (follicle-stimulating hormone, luteinizing hormone, estradiol, and sex hormone binding globulin) were combined to create a multilayer perceptron score using a neural network classifier. Prediction of PCOS prior to clinical diagnosis in an out-of-sample test set of patients achieved an average AUC of 85%, 81%, 80%, and 82%, respectively in Models I, II, III and IV. Significant positive predictors of PCOS diagnosis across models included hormone levels and obesity; negative predictors included gravidity and positive bHCG. ConclusionMachine learning algorithms were used to predict PCOS based on a large at-risk population. This approach may guide early detection of PCOS within EHR-interfaced populations to facilitate counseling and interventions that may reduce long-term health consequences. Our model illustrates the potential benefits of an artificial intelligence-enabled provider assistance tool that can be integrated into the EHR to reduce delays in diagnosis. However, model validation in other hospital-based populations is necessary. 
    more » « less
  5. Forecasting models are a central part of many control systems, where high consequence decisions must be made on long latency control variables. These models are particularly relevant for emerging artificial intelligence (AI)-guided instrumentation, in which prescriptive knowledge is needed to guide autonomous decision-making. Here we describe the implementation of a long short-term memory model (LSTM) for forecasting of electron energy loss spectroscopy (EELS) data, one of the richest analytical probes of materials and chemical systems. We describe key considerations for data collection, preprocessing, training, validation, and benchmarking, showing how this approach can yield powerful predictive insight into order-disorder phase transitions. Finally, we comment on how such a model may integrate with emerging AI-guided instrumentation for powerful high-speed experimentation. 
    more » « less