This content will become publicly available on September 1, 2024

Title: Examining data imbalance in crowdsourced reports for improving flash flood situational awareness
Crowdsourced data are increasingly used to enhance situational awareness during disasters. While recent studies have shown promising results regarding the potential of crowdsourced data (such as user-generated flood reports) for flash flood mapping and situational awareness, little attention has been paid to data imbalance issues that could introduce biases into the data and the resulting assessments. To address this gap, we examine biases present in crowdsourced reports to identify data imbalance, with the goal of improving disaster situational awareness. Three biases are examined: sample bias, spatial bias, and demographic bias. To examine these biases, we analyzed flooding reported through 3-1-1 (a citizen hotline that allows the community to report problems such as flooding) and Waze (a GPS navigation app that allows drivers to report flooded roads) with respect to FEMA damage data collected in the aftermath of Tropical Storm Imelda in Harris County, Texas, in 2019 and Hurricane Ida in New York City in 2021. First, sample bias is assessed by expanding the flood-related categories in 3-1-1 reports. Integrating other flooding-related topics into the Global Moran's I and Local Indicator of Spatial Association (LISA) analyses revealed more communities that were impacted by floods. To examine spatial bias, we perform LISA and BI-LISA tests on the data sets (FEMA damage, 3-1-1 reports, and Waze reports) at the census tract level and the census block group level. Comparing the two geographical aggregations, we found that the larger spatial aggregation, census tracts, shows less data imbalance in the results. Finally, through a regression analysis, we found that 3-1-1 reports and Waze reports have data imbalance limitations in areas where minority populations and single-parent households reside. The findings of this study advance the understanding of data imbalance and biases in crowdsourced datasets that are increasingly used for disaster situational awareness. By addressing data imbalance issues, researchers and practitioners can proactively mitigate biases in crowdsourced data and prevent biased and inequitable decisions and actions.
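The abstract relies on Global Moran's I to test whether reports cluster in space. As a rough illustration only (not the authors' code), the statistic can be sketched in plain Python. The four-area chain, its binary adjacency weights, and the report counts below are hypothetical values chosen for the example; the study itself works with census tracts and block groups.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per area and a square spatial weights matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where S0 is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    denominator = sum(d * d for d in dev)
    if s0 == 0 or denominator == 0:
        raise ValueError("weights must be non-zero and values must vary")
    numerator = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * numerator / denominator


# Hypothetical layout: four areas in a row, each adjacent to its neighbors.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical report counts: a smooth gradient versus an alternating pattern.
clustered = morans_i([1, 2, 3, 4], chain_weights)
dispersed = morans_i([1, 4, 1, 4], chain_weights)
print(clustered, dispersed)
```

Positive values indicate that similar counts sit next to each other, and negative values indicate a checkerboard-like pattern. A real analysis would also need a significance test, for example by permutation, which this sketch omits.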
Award ID(s):
1832662
NSF-PAR ID:
10481377
Publisher / Repository:
Elsevier
Date Published:
Journal Name:
International Journal of Disaster Risk Reduction
Volume:
95
Issue:
C
ISSN:
2212-4209
Page Range / eLocation ID:
103825
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Urban flooding is a major natural disaster that poses a serious threat to the urban environment. There is strong demand for mapping flood extent in near real time to support disaster rescue and relief missions, reconstruction efforts, and financial loss evaluation. Many efforts have been made to identify flooding zones with remote sensing data and image processing techniques. Unfortunately, the near real-time production of accurate flood maps over impacted urban areas has not been well investigated due to three major issues. (1) Satellite imagery with high spatial resolution over urban areas usually has a nonhomogeneous background due to different types of objects such as buildings, moving vehicles, and road networks. As such, classical machine learning approaches can hardly model the spatial relationship between sample pixels in the flooding area. (2) Handcrafted features associated with the data are usually required as input for conventional flood mapping models, which may not be able to fully utilize the underlying patterns in the large amount of available data. (3) High-resolution optical imagery often has varied pixel digital numbers (DNs) for the same ground objects as a result of highly inconsistent illumination conditions during a flood. Accordingly, traditional methods of flood mapping have major limitations in generalizing to testing data. To address these issues in urban flood mapping, we developed a patch similarity convolutional neural network (PSNet) using satellite multispectral surface reflectance imagery before and after flooding with a spatial resolution of 3 meters. We used spectral reflectance instead of raw pixel DNs so that the influence of inconsistent illumination caused by varied weather conditions at the time of data collection can be greatly reduced. Such consistent spectral reflectance data also enhance the generalization capability of the proposed model. Experiments on high-resolution imagery before and after urban flooding events (i.e., the 2017 Hurricane Harvey and the 2018 Hurricane Florence) showed that the developed PSNet can produce urban flood maps with consistently high precision, recall, F1 score, and overall accuracy compared with baseline classification models including support vector machine, decision tree, random forest, and AdaBoost, which were often poor in either precision or recall. The study paves the way to fuse bi-temporal remote sensing images for near real-time precision damage mapping associated with other types of natural hazards (e.g., wildfires and earthquakes).
  2. Abstract

    Flood nowcasting refers to near-future prediction of flood status as an extreme weather event unfolds to enhance situational awareness. The objective of this study was to adopt and test a novel structured deep-learning model for urban flood nowcasting by integrating physics-based and human-sensed features. We present a new computational modeling framework including an attention-based spatial–temporal graph convolution network (ASTGCN) model and different streams of data that are collected in real time, preprocessed, and fed into the model to consider spatial and temporal information and dependencies that improve flood nowcasting. The novelty of the computational modeling framework is threefold: first, the model is capable of considering spatial and temporal dependencies in inundation propagation thanks to the spatial and temporal graph convolutional modules; second, it enables capturing the influence on flood nowcasting of heterogeneous temporal data streams that can signal flooding status, including physics-based features (e.g., rainfall intensity and water elevation) and human-sensed data (e.g., residents’ flood reports and fluctuations of human activity); and third, its attention mechanism enables the model to direct its focus to the most influential features, which vary dynamically and influence the flood nowcasting. We show the application of the modeling framework in the context of Harris County, Texas, as the study area and 2017 Hurricane Harvey as the flood event. Three categories of features are used for nowcasting the extent of flood inundation in different census tracts: (i) static features that capture spatial characteristics of various locations and influence their flood status similarity, (ii) physics-based dynamic features that capture changes in hydrodynamic variables, and (iii) heterogeneous human-sensed dynamic features that capture various aspects of residents’ activities that can provide information regarding flood status. Results indicate that the ASTGCN model provides superior performance for nowcasting of urban flood inundation at the census-tract level, with precision 0.808 and recall 0.891, outperforming other state-of-the-art models. Moreover, ASTGCN model performance improves when heterogeneous dynamic features are added to a model that relies solely on physics-based features, which demonstrates the promise of using heterogeneous human-sensed data for flood nowcasting. Given the results of the model comparisons, the proposed modeling framework merits further investigation as more data on historical events become available, in order to develop a predictive tool that provides community responders with an enhanced prediction of flood inundation during urban floods.
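For readers who want a single summary figure, the reported precision and recall can be combined into an F1 score (their harmonic mean). The short sketch below is an illustration only; the abstract itself reports precision and recall, and the F1 value is derived here rather than quoted from the study.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        raise ValueError("precision and recall cannot both be zero")
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.808, 0.891)
print(round(f1, 3))  # prints 0.847
```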

     
  3. Abstract

    Smart resilience is the beneficial result of the convergence of the fields of data science and urban resilience to flooding. The objective of this study is to propose and demonstrate a smart flood resilience framework that leverages heterogeneous community-scale big data and infrastructure sensor data to enhance predictive risk monitoring and situational awareness. The smart flood resilience framework focuses on four core capabilities that could be augmented by the use of heterogeneous community-scale big data and analytics techniques: (1) predictive flood risk mapping; (2) automated rapid impact assessment; (3) infrastructure failure prediction and monitoring; and (4) smart situational awareness capabilities. We demonstrate the components of these core capabilities of the smart flood resilience framework in the context of the 2017 Hurricane Harvey in Harris County, Texas. First, we present the use of flood sensors for the prediction of floodwater overflow in channel networks and inundation of co-located road networks. Second, we discuss the use of social media and machine learning techniques for assessing the impacts of floods on communities and sensing emotion signals to examine societal impacts. Third, we describe the use of high-resolution traffic data in network-theoretic models for nowcasting of flood propagation on road networks and the disrupted access to critical facilities, such as hospitals. Fourth, we introduce how location-based and credit card transaction data were used in spatial analyses to proactively evaluate the recovery of communities and the impacts of floods on businesses. These analyses show the significance of the core capabilities of the smart flood resilience framework in helping emergency managers, city planners, public officials, responders, and volunteers to better cope with the impacts of catastrophic flooding events.

     
  4. The 2021 Social Science Extreme Events Research (SSEER) Census summarizes the results of responses gathered from 1,396 social scientists who responded to the SSEER survey between its release date on July 8, 2018 and December 31, 2021. This report characterizes the diversity, disciplinary skills, and expertise within the research community. It is organized into the following categories: (1) number of researchers; (2) researcher geographic location; (3) disciplinary background and expertise; (4) educational and professional background; (5) level of involvement in hazards and disaster research (core, periodic, situational, emerging); (6) research methods and approaches; (7) disaster types, phases, number of extreme events studied, and names of specific extreme events studied; and (8) researcher demographic characteristics. The document concludes with further readings, data citations, and a brief description of the SSEER network. This annual report responds to longstanding calls to better characterize the composition of the hazards and disaster workforce. The 2018-2021 SSEER Census reports are available for download as color and black & white PDF files at: https://converge.colorado.edu/research-networks/sseer/sseer-census/. Social scientists who study hazards and disasters can become part of this network and annual count by joining SSEER at: https://converge.colorado.edu/research-networks/sseer/. More information on SSEER and the other National Science Foundation-funded reconnaissance and research networks is available on the CONVERGE website at: https://converge.colorado.edu/research-networks/. This project includes a survey instrument, data, and annual census reports from the National Science Foundation (NSF)-funded Social Science Extreme Events Research (SSEER) network, which is headquartered at the Natural Hazards Engineering Research Infrastructure (NHERI) CONVERGE facility at the Natural Hazards Center at the University of Colorado Boulder.
The SSEER network, which was launched in 2018, was formed, in part, to respond to the need for more specific information about the status and expertise of the social science hazards and disaster research workforce. The mission of SSEER is to identify and map social scientists involved in hazards and disaster research in order to highlight their expertise and connect social science researchers to one another, to interdisciplinary teams, and to communities at risk to hazards and affected by disasters. Ultimately, the goals of SSEER are to amplify the contributions of social scientists and to advance the field through expanding the available social science evidence base. To see the SSEER map and to learn more about the SSEER initiative, please visit: https://converge.colorado.edu/research-networks/sseer. All social and behavioral scientists and those in allied disciplines who study the human, economic, policy, and health dimensions of disasters are invited to join this network via a short online survey. This DesignSafe project includes: (1) the SSEER survey instrument; (2) de-identified data, which is updated annually as new researchers join the SSEER network and returning members update their information; and (3) SSEER annual census reports. These resources are available to all who are interested in learning more about the composition of the social science hazards and disaster workforce. SSEER is part of a larger ecosystem of NSF-funded extreme events research and reconnaissance networks designed to help coordinate disciplinary communities in engineering and the sciences, while also encouraging cross-disciplinary information sharing and interdisciplinary integration. To learn more about the networks and research ecosystem, please visit: https://converge.colorado.edu/research-networks/. 
  5. The 2019 Social Science Extreme Events Research (SSEER) Census summarizes the results of responses gathered from 949 social scientists who filled out the SSEER survey between its release date on July 8, 2018 and December 31, 2019. This report characterizes the diversity and wide range of disciplinary skills and expertise among the research community. It is organized into the following categories: (1) researcher geographic location; (2) disciplinary background and expertise; (3) educational and professional background; (4) level of involvement in hazards and disaster research (core, periodic, situational, emerging); (5) research methods and approaches; (6) disaster types, phases, and specific extreme events studied; and (7) researcher demographic characteristics. The document concludes with further readings, data citations, and a brief description of the SSEER network. This annual report responds to longstanding calls to better characterize the composition of the hazards and disaster workforce. The 2018 and 2019 SSEER Census reports are available for download via PDF and also online at: https://converge.colorado.edu/research-networks/sseer/sseer-census/. Social scientists who study hazards and disasters can become a part of this network and annual count by joining SSEER at: https://converge.colorado.edu/research-networks/sseer/. More information on SSEER and the other National Science Foundation-funded reconnaissance and research networks is available on the CONVERGE website at: https://converge.colorado.edu/research-networks/. This project includes a survey instrument, data, and annual census reports from the National Science Foundation (NSF)-funded Social Science Extreme Events Research (SSEER) network, which is headquartered at the Natural Hazards Engineering Research Infrastructure (NHERI) CONVERGE facility at the Natural Hazards Center at the University of Colorado Boulder.
The SSEER network, which was launched in 2018, was formed, in part, to respond to the need for more specific information about the status and expertise of the social science hazards and disaster research workforce. The mission of SSEER is to identify and map social scientists involved in hazards and disaster research in order to highlight their expertise and connect social science researchers to one another, to interdisciplinary teams, and to communities at risk to hazards and affected by disasters. Ultimately, the goals of SSEER are to amplify the contributions of social scientists and to advance the field through expanding the available social science evidence base. To see the SSEER map and to learn more about the SSEER initiative, please visit: https://converge.colorado.edu/research-networks/sseer. All social and behavioral scientists and those in allied disciplines who study the human, economic, policy, and health dimensions of disasters are invited to join this network via a short online survey. This DesignSafe project includes: (1) the SSEER survey instrument; (2) de-identified data, which is updated annually as new researchers join the SSEER network and returning members update their information; and (3) SSEER annual census reports. These resources are available to all who are interested in learning more about the composition of the social science hazards and disaster workforce. SSEER is part of a larger ecosystem of NSF-funded extreme events research and reconnaissance networks designed to help coordinate disciplinary communities in engineering and the sciences, while also encouraging cross-disciplinary information sharing and interdisciplinary integration. To learn more about the networks and research ecosystem, please visit: https://converge.colorado.edu/research-networks/. 