Title: Assessing Trustworthiness of Crowdsourced Flood Incident Reports Using Waze Data: A Norfolk, Virginia Case Study
Climate change and sea-level rise are increasingly leading to higher and more prolonged high tides, which, in combination with the growing intensity of rainfall and storm surges and with insufficient drainage infrastructure, result in frequent recurrent flooding in coastal cities. There is a pressing need to understand the occurrence of roadway flooding incidents in order to enact appropriate mitigation measures. Agency data for roadway flooding events are scarce and resource-intensive to collect. Crowdsourced data can provide a low-cost alternative for mapping roadway flood incidents in real time; however, their reliability is questionable. This research demonstrates a framework for assessing the trustworthiness of crowdsourced flood incident data in a case study of Norfolk, Virginia. Publicly available (but spatially limited) flood incident data from the city, in combination with different environmental and topographical factors, are used to create a logistic regression model that predicts the probability of roadway flooding at any location on the roadway network. The prediction accuracy of the model was found to be 90.5%. When this model was applied to crowdsourced Waze flood incident data, 71.7% of the reports were predicted to be trustworthy. This study demonstrates the potential for using Waze incident report data for roadway flooding detection, providing a framework for cities to identify trustworthy reports in real time to enable rapid situation assessment and mitigation to reduce incident impact.
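The general idea described in the abstract (fit a logistic regression on known flood incidents, then score crowdsourced reports by predicted flood probability) might be sketched as follows. This is a minimal illustration only, not the authors' implementation: the two features (elevation and rainfall), the training rows, the report locations, and the 0.5 threshold are all hypothetical assumptions, and the data are synthetic.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def flood_probability(w, x):
    """Predicted probability of roadway flooding at feature vector x."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def trustworthy_share(w, reports, threshold=0.5):
    """Fraction of reports whose location is predicted as flood-prone."""
    flags = [flood_probability(w, x) >= threshold for x in reports]
    return sum(flags) / len(flags)


# Synthetic, made-up rows: (elevation_m, rainfall_cm) -> flooded (1) or not (0).
X_train = [(0.5, 5.0), (0.8, 4.0), (1.0, 6.0), (3.0, 1.0), (4.0, 0.5), (3.5, 2.0)]
y_train = [1, 1, 1, 0, 0, 0]
weights = fit_logistic(X_train, y_train)

# Hypothetical crowdsourced reports, described by the same two features.
waze_reports = [(0.7, 5.5), (3.8, 0.8), (0.7, 4.5)]
share = trustworthy_share(weights, waze_reports)
print(f"share of reports flagged trustworthy: {share:.2f}")
```

In this toy setup a report counts as trustworthy when the model considers flooding likely at the reported conditions; the share printed here plays the same role as the percentage of trustworthy Waze reports quoted in the abstract, although the study's actual features and decision rule may differ.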
Award ID(s):
1735587
NSF-PAR ID:
10291591
Journal Name:
Transportation Research Record: Journal of the Transportation Research Board
ISSN:
0361-1981
Page Range / eLocation ID:
036119812110312
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The number of emergencies has increased over the years with the growth in urbanization. This pattern has overwhelmed emergency services, which have limited resources, and demands the optimization of response processes. It is partly due to the traditional ‘reactive’ approach that emergency services take to collecting data about incidents, in which a source initiates a call to the emergency number (e.g., 911 in the U.S.), delaying and limiting the potentially optimal response. Crowdsourcing platforms such as Waze provide an opportunity to develop a rapid, ‘proactive’ approach to collecting data about incidents through crowd-generated observational reports. However, the reliability of reporting sources and the spatio-temporal uncertainty of the reported incidents challenge the design of such a proactive approach. Thus, this paper presents a novel method for emergency incident detection using noisy crowdsourced Waze data. We propose a principled computational framework based on Bayesian theory to model the uncertainty in the reliability of crowd-generated reports and their integration across space and time to detect incidents. Extensive experiments using data collected from Waze and the officially reported incidents in Nashville, Tennessee, in the U.S. show that our method can outperform strong baselines for both F1-score and AUC. The application of this work provides an extensible framework to incorporate different noisy data sources for proactive incident detection to improve and optimize emergency response operations in our communities.
  2. Abstract

    Flood nowcasting refers to near-future prediction of flood status as an extreme weather event unfolds to enhance situational awareness. The objective of this study was to adopt and test a novel structured deep-learning model for urban flood nowcasting by integrating physics-based and human-sensed features. We present a new computational modeling framework including an attention-based spatial–temporal graph convolution network (ASTGCN) model and different streams of data that are collected in real time, preprocessed, and fed into the model to consider spatial and temporal information and dependencies that improve flood nowcasting. The novelty of the computational modeling framework is threefold: first, the model is capable of considering spatial and temporal dependencies in inundation propagation thanks to the spatial and temporal graph convolutional modules; second, it captures the influence on flood nowcasting of heterogeneous temporal data streams that can signal flooding status, including physics-based features (e.g., rainfall intensity and water elevation) and human-sensed data (e.g., residents’ flood reports and fluctuations of human activity); third, its attention mechanism enables the model to direct its focus to the most influential features, which vary dynamically. We show the application of the modeling framework in the context of Harris County, Texas, as the study area and 2017 Hurricane Harvey as the flood event. Three categories of features are used for nowcasting the extent of flood inundation in different census tracts: (i) static features that capture spatial characteristics of various locations and influence their flood status similarity, (ii) physics-based dynamic features that capture changes in hydrodynamic variables, and (iii) heterogeneous human-sensed dynamic features that capture various aspects of residents’ activities that can provide information regarding flood status.
    Results indicate that the ASTGCN model provides superior performance for nowcasting of urban flood inundation at the census-tract level, with precision 0.808 and recall 0.891, outperforming other state-of-the-art models. Moreover, ASTGCN model performance improves when heterogeneous dynamic features are added to a model that otherwise relies solely on physics-based features, which demonstrates the promise of using heterogeneous human-sensed data for flood nowcasting. Given the results of the model comparisons, the proposed modeling framework merits further investigation as more data from historical events become available, in order to develop a predictive tool that provides community responders with an enhanced prediction of flood inundation during urban floods.
  3. Principled decision making in emergency response management necessitates the use of statistical models that predict the spatial-temporal likelihood of incident occurrence. These statistical models are then used for proactive stationing, which allocates first responders across the spatial area in order to reduce overall response time. Traditional methods that simply aggregate past incidents over space and time fail to make useful short-term predictions when the spatial region is large and focused on fine-grained spatial entities like interstate highway networks. This is partially due to the sparsity of incidents with respect to the area in consideration. Further, accidents are affected by several covariates, and collecting, cleaning, and managing multiple streams of data from various sources is challenging for large spatial areas. In this paper, we highlight how this problem is being solved for the state of Tennessee, a state in the USA with a total area of over 100,000 sq. km. Our pipeline, based on a combination of synthetic resampling, non-spatial clustering, and learning from data, can efficiently forecast the spatial and temporal dynamics of accident occurrence, even under sparse conditions. In the paper, we describe our pipeline, which uses data related to roadway geometry, weather, historical accidents, and real-time traffic congestion to aid accident forecasting. To understand how our forecasting model can affect allocation and dispatch, we improve upon a classical resource allocation approach. Experimental results show that our approach can significantly reduce response times in the field in comparison with current approaches followed by first responders.
  4. Heavy rainfall leads to severe flooding problems with catastrophic socio-economic impacts worldwide. Hydrologic forecasting models have been applied to provide alerts of extreme flood events and reduce damage, yet they are still subject to many uncertainties due to the complexity of hydrologic processes and errors in the forecasted timing and intensity of floods. This study demonstrates the efficacy of using eXtreme Gradient Boosting (XGBoost) as a state-of-the-art machine learning (ML) model to forecast gauge stage levels at a 5-min interval with various look-out time windows. A flood alert system (FAS) built upon the XGBoost models is evaluated on two historical flooding events for a flood-prone watershed in Houston, Texas. The predicted stage values from the FAS are compared with observed values and demonstrate good performance on statistical metrics (RMSE and KGE). This study further compares the performance of the FAS under two scenarios with different input data settings: (1) using the data from the gauges within the study area only and (2) including the data from additional gauges outside of the study area. The results suggest that models that use the gauge information within the study area only (Scenario 1) are sufficient and advantageous in terms of their accuracy in predicting the arrival times of the floods. One of the benefits of the XGBoost-based FAS outlined in this study is that it can run in a continuous mode to automatically detect floods, without the external starting trigger usually required by conventional event-based systems. This paper illustrates a data-driven FAS framework as a prototype that stakeholders can utilize based solely on their gauging information for local flood warning and mitigation practices.
  5. Crowdsourced data have been finding practical use for enhancing situational awareness during disasters. While recent studies have shown promising results regarding the potential of crowdsourced data (such as user-generated flood reports) for flash flood mapping and situational awareness, little attention has been paid to data imbalance issues that could introduce biases in data and assessment. To address this gap, in this study, we examine biases present in crowdsourced reports to identify data imbalance with a goal of improving disaster situational awareness. Three biases are examined: sample bias, spatial bias, and demographic bias. To examine these biases, we analyzed reported flooding from 3-1-1 reports (a citizen hotline allowing the community to report problems such as flooding) and Waze reports (from a GPS navigation app that allows drivers to report flooded roads) with respect to FEMA damage data collected in the aftermath of Tropical Storm Imelda in Harris County, Texas, in 2019 and Hurricane Ida in New York City in 2021. First, sample bias is assessed by expanding the flood-related categories in 3-1-1 reports. Integrating other flooding-related topics into the Global Moran's I and Local Indicator of Spatial Association (LISA) analyses revealed more communities that were impacted by floods. To examine spatial bias, we perform the LISA and BI-LISA tests on the data sets (FEMA damage, 3-1-1 reports, and Waze reports) at the census tract level and census block group level. By looking at two geographical aggregations, we found that the larger spatial aggregation, census tracts, shows less data imbalance in the results. Through a regression analysis, we found that 3-1-1 reports and Waze reports have data imbalance limitations in areas where minority populations and single-parent households reside.
    The findings of this study advance understanding of data imbalance and biases in crowdsourced datasets that are increasingly used for disaster situational awareness. By addressing data imbalance issues, researchers and practitioners can proactively mitigate biases in crowdsourced data and prevent biased and inequitable decisions and actions.