<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Flood Detection Framework Fusing The Physical Sensing &amp; Social Sensing</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>09/01/2020</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10200895</idno>
					<idno type="doi">10.1109/SMARTCOMP50058.2020.00080</idno>
					<title level='j'>2020 IEEE International Conference on Smart Computing (SMARTCOMP)</title>
<idno></idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Neha Singh</author><author>Bipendra Basnyat</author><author>Nirmalya Roy</author><author>Aryya Gangopadhyay</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[We investigate the practical challenge of localized flood detection in a real smart city environment, fusing physical sensor and social sensing models into a reliable and accurate flood monitoring and detection framework. Our proposed framework uses the physical and social sensing models to provide flood-related updates to city officials. We deployed our flood monitoring system in Ellicott City, Maryland, USA and connected it to the social sensing module to integrate and analyze flood-related sensor and social data. Our ground-based sensor network model records sensor data and performs predictive analytics by forecasting the rise in water level (RMSE=0.2), which indicates the severity of upcoming flash floods, whereas our social sensing model helps collect and track flood-related feeds from Twitter. We employ a pre-trained model and an inductive transfer learning approach to classify flood-related tweets with 90% accuracy on unseen target flood events. Finally, our flood detection framework categorizes the flood-relevant localized contextual details into more meaningful classes to help emergency services and local authorities make effective decisions.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>The National Weather Service reported 28,826 flash flood events in the United States from October 2007 to October 2015, which resulted in 278 deaths and crop and property damage worth millions of dollars <ref type="bibr">[1]</ref>. Monitoring and predicting floods proactively would help significantly in saving lives and minimizing property damage. Current flood management systems are moving towards crisis-aware decision making, using conventional artificial intelligence and computational intelligence methods to detect flood events early with a low false alarm rate. Emergency response services and city authorities can then supply efficient strategies and help in post-disaster situations <ref type="bibr">[8]</ref>. Social sensing models are heavily used in computer modelling, precipitation sensing and communication systems to make flood warning systems more effective and reliable.</p><p>Our work focuses on the practical challenge of incorporating localized contextual details, such as real-time damage, situation awareness, and event-related updates, for a specific area. These details are acquired by the local physical sensor deployment and by social sensing modules that utilize the power of social media.<note place="foot">This project is funded by NSF 1640625.</note> We deployed a sensor system that uses physical sensors measuring water flow and water level in the Patapsco River near Ellicott City, Maryland, USA, and we also use Howard County historical data for the same region. The sensor model in this study uses a forecasting method to predict the water level and transmits a warning/trigger to the social sensing model. The social sensing model is well suited for real-time and post-disaster analysis, since people begin to share only after they witness the incident. Combining the sensor and social models lets each contribute its advantages, yielding a more reliable and accurate model. This fusion model can also significantly reduce false alarms about flash flood warnings and heavy rainfall.</p><p>The need to integrate another data source with the sensor data is illustrated by Fig. <ref type="figure">1a</ref>, which shows a sudden rise in the water level sensor data for January 2018, collected from Howard County data. Although there was no flash flood at that time, we received false warnings/faulty sensor data. Instead of relying only on sensor data, we corroborated it with social media data and, using relevant keywords, found snow-melt during the same period as the localized contextual detail. Fig. <ref type="figure">1b</ref> helps explain the localized contextual detail that might be the reason for the sudden spike in Fig. <ref type="figure">1a</ref>. Social sensing here provides more insightful and actionable information about a missing ice fisherman at the Patapsco River. This would increase awareness, help decrease false alarms, and encourage people to take flood warnings more seriously.</p><p>In this work, our framework uses spatio-temporal domain fusion with the help of @umbc floodbot, a social media agent connected to our deployed sensors that bridges the temporal gap between sensor and social data. It can be deployed to validate the flood forecast, to obtain contextual information through localized flood detection and tracking, and to extract actionable insights that help city officials with emergency response services. We summarize our research goals and contributions as follows:</p><p>&#8226; We deployed four flood monitoring systems in Ellicott City, MD, as shown in Fig. <ref type="figure">2a</ref>, for localized flood monitoring, detection and decision making.</p>
<p>&#8226; We propose a cyber-physical system for emergency scenarios in which the @umbc floodbot integrates physical sensing and social sensing, utilizing user engagement for social gain and reliable flood detection.</p><p>&#8226; We classify the localized flood-related tweets with 76% accuracy, using minimal labels and a transfer learning method (ULMFiT), into actionable classes used by local authorities for emergency response.</p>
<p>&#8226; We validate our model on another local flood event using our fusion tweet data, reaching 90% accuracy, for crucial real-time decision making.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. RELATED WORK</head><p>In this section we discuss existing approaches that use various forms of data to predict floods.</p><p>Flood Prediction from Remote Sensing: Wireless Sensor Networks (WSNs) are generally powerful, cost effective, adaptable and extensible, and they support fast transfer and processing of high-resolution data, which enables further analysis and timely alerts in various disaster management applications <ref type="bibr">[4]</ref>. Existing research addresses the flooding problem using traditional methods such as hydrological models, mathematical and probabilistic models, numerical weather prediction, and WSN models, along with recent advances in artificial neural networks <ref type="bibr">[6]</ref> and some computer vision based methods using satellite image data <ref type="bibr">[7]</ref>.</p><p>Flood Prediction from Social Sensing: Twitter is one of the most exploited social media platforms for event tracking, detection, visualization and summarization using various Natural Language Processing techniques. These methods are significantly helpful in improving situational awareness and decision making. Traditional general-purpose data collection platforms, such as CrisisLex <ref type="bibr">[2]</ref>, CrisisNLP <ref type="bibr">[3]</ref> and AIDR <ref type="bibr">[5]</ref>, use crowdsourcing services and manual observation to label disaster-related information, but they do not cater to location-specific needs. We need to leverage existing knowledge efficiently and perform the intended task effectively by adapting to the new area/domain, as shown in a recent paper that uses transfer learning for flood detection on a public dataset <ref type="bibr">[21]</ref>.
Other research collects Twitter data from the flash flood that occurred in Ellicott City in July 2016 and analyzes the impact of its text and images <ref type="bibr">[12]</ref>, with plans to collect other crisis event datasets for text analysis and sentiment analysis <ref type="bibr">[13]</ref> <ref type="bibr">[14]</ref>. The aim is to obtain real-time situation awareness and the general sentiment of the public towards the event, which can be shared with the appropriate people or organizations to help in rescue operations and emergency response decision making.</p><p>Fusion of Remote Sensing &amp; Social Sensing: The study in <ref type="bibr">[16]</ref> shows that using geo-social media data as a proxy environmental variable, integrated with authoritative rainfall data, improves early flood warning systems. In <ref type="bibr">[17]</ref>, a flood inundation reconstruction model is proposed using three kinds of data: remote sensing satellite imagery, stream gauge readings, and social media tweets, the last used as a verification tool to locally enhance the flood-related details. The work in <ref type="bibr">[18]</ref> proposes an information-based heterogeneous data fusion framework for flood density estimation based on maximum entropy and the least effort principle. Our work focuses on real-time flood monitoring and prediction using localized sensor and social media data for rich contextual insights, in order to solve the localized flood problem, which needs personalized attention. We propose to use locally deployed sensor systems and connected social media data to obtain balanced results, as explained in the next section.</p><p>As discussed earlier, social media naturally sees an influx of posts during any crisis, but that influx contains more noise than valuable information. The main cause can be attributed to the spatio-temporal heterogeneity of the study area, i.e. "Ellicott City".
For example, we noticed that with our general social media data collection based on keyword search, we were getting tweets from all over the United States (shown in Fig. <ref type="figure">1c</ref>), which were mostly irrelevant (belonging to other flood events), spam, repeated, or merely warnings about flood events. Technological advances, ease of access, and the constant availability of "share", "re-tweet", and "like" buttons further muddle social media with content that offers little insight. Thus, in this work, we propose and evaluate a solution to this problem and present our vision of a social and physical sensor amalgamation model, shown in Fig. <ref type="figure">3a</ref>.</p><p>In order to find relevant social media content and include the localized context, we have developed a novel methodology to restrict the inflow of social media postings from unrelated areas. Our model consists of a spatial bounding box, which we call the Spatial Envelope, and a Time Domain Joiner that makes a crosswalk possible between the sensor data and the social media data. We envisage that such a spatio-temporal bounding box will limit social media infiltration from irrelevant geolocations. The time domain crosswalk enables information extraction and establishes the relevancy of the social media content.</p></div>
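<div xmlns="http://www.tei-c.org/ns/1.0"><p>The Spatial Envelope and time domain joiner described above can be sketched as a simple filter. The following is a minimal illustration under stated assumptions, not the deployed implementation: the bounding-box coordinates, the two-hour window, the post fields (`lat`, `lon`, `time`), the toy posts, and the names `SpatialEnvelope`, `joins_in_time` and `filter_posts` are all hypothetical. A post is kept only if it is geotagged inside the envelope and was published within a fixed window after a sensor trigger, reflecting the observation that people share only after they witness an incident.</p>
<p><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SpatialEnvelope:
    """Axis-aligned latitude/longitude bounding box (bounds are illustrative)."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def joins_in_time(trigger: datetime, posted: datetime, window: timedelta) -> bool:
    """Time domain joiner: link a post to a sensor trigger if it was published
    no earlier than the trigger and no later than trigger + window."""
    return timedelta(0) <= posted - trigger <= window


def filter_posts(posts, envelope, trigger, window):
    """Keep only posts inside the envelope and inside the time window."""
    return [
        p for p in posts
        if envelope.contains(p["lat"], p["lon"])
        and joins_in_time(trigger, p["time"], window)
    ]


# Hypothetical envelope roughly around Ellicott City, MD (an assumption).
ENVELOPE = SpatialEnvelope(39.25, 39.28, -76.82, -76.78)
TRIGGER = datetime(2019, 10, 31, 18, 0)  # toy trigger time
POSTS = [  # toy posts, not collected data
    {"id": 1, "lat": 39.2672, "lon": -76.7998, "time": datetime(2019, 10, 31, 18, 30)},
    {"id": 2, "lat": 29.7604, "lon": -95.3698, "time": datetime(2019, 10, 31, 18, 30)},  # outside the box
    {"id": 3, "lat": 39.2672, "lon": -76.7998, "time": datetime(2019, 10, 31, 23, 0)},   # outside the window
]
kept = filter_posts(POSTS, ENVELOPE, TRIGGER, timedelta(hours=2))
print([p["id"] for p in kept])
```
]]></p>
<p>In this toy run only the first post survives both checks. In practice many tweets carry no geotag, so a real filter would need a fallback (for example place names in the text), which this sketch does not attempt.</p></div>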
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Systems Components</head><p>Our system mainly has two kinds of sensors, the physical sensor and the social sensor, along with the underlying architecture that supports them. The data gathering and information flow are depicted in Fig. <ref type="figure">3b</ref>, and the onsite setup of one of these sensors is depicted in Fig. <ref type="figure">2b</ref>.</p><p>2) Data Integration Layer: The physical sensors send real-time data to our data integration layer, an Amazon Web Services module that captures the readings posted by our physical sensors and prepares them for social media integration. The cloud instance is also provisioned with the Twitter account (@umbc floodbot), which receives the sensor data and prepares it for social media posting.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Fig. 4: Social and Physical Model Integration Output</head><p>The integrated data, shown in Fig. <ref type="figure">4</ref>, displays the water level and a real-time image that captures the contextual details. Once such data are generated, the back tracking unit, or social sensor analytic platform, consumes and enriches them. Tracking and analyzing such integrated social feeds will lead us to more informative local social data.</p><p>3) Sensor Data Analysis: Algorithm 1 shows one of our sensor data analysis modules, which forecasts the rise in water flow level from the sensor data using an LSTM forecasting method. This process takes the water-related sensor data as input, processes it, and predicts the water level for a specific duration depending on the area conditions. It is also capable of sending a signal to social media and posting the insightful results on Twitter.</p><p>4) Social Sensors: Social media data has been extensively researched for various disaster events, primarily using geolocation and images, which are significantly fewer in number. In this work, we use a transfer learning method for text classification <ref type="bibr">[21]</ref> because text data is widely available but has not been exploited to its full potential. Using text data along with sensor models would be an optimal fusion method for reliable, localized, context-aware situation awareness.</p></div>
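<div xmlns="http://www.tei-c.org/ns/1.0"><p>Before a forecasting model such as the LSTM of the sensor data analysis module can be trained, the water level series has to be framed as supervised examples, and a forecast has to be turned into a trigger for the social sensing side. The sketch below illustrates both steps in plain Python. It is not Algorithm 1 itself: the function names, the lookback length, the threshold and the toy water levels are assumptions made for illustration.</p>
<p><![CDATA[
```python
def make_windows(series, lookback, horizon=1):
    """Frame a time series as (input window, target) pairs.

    Each window holds `lookback` consecutive readings; the target is the
    reading `horizon` steps after the end of the window.
    """
    pairs = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs


def should_trigger(forecast_level, threshold):
    """Signal the social sensing module when the forecast reaches the threshold."""
    return forecast_level >= threshold


levels = [1.0, 1.1, 1.3, 1.8, 2.6]  # toy water levels, not measured data
pairs = make_windows(levels, lookback=3)
for window, target in pairs:
    print(window, "->", target)

# Hypothetical alert threshold of 2.5 (same unit as the toy levels).
print(should_trigger(pairs[-1][1], threshold=2.5))
```
]]></p>
<p>A trained model would replace the known target with its own prediction before the threshold check; the trigger logic stays the same.</p></div>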
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5) Social Media Integration:</head><p>The Tweet-Bot is connected to our on-site deployed systems. It reports important information regarding the flood and water flow in the area, along with current contextual details captured by the sensors' camera, and draws on social media user engagement. For example, Fig. <ref type="figure">4</ref> shows a tweet posted by our system regarding the water flow over a certain amount of time. Once the tweet gets the attention of the general public on Twitter, it can be viewed, shared, and re-tweeted repeatedly, depending on the severity of the flood event. We plan to track the activity related to each notification posted on Twitter by our system and perform appropriate analysis to obtain contextual information about the localized event from local users. Some of the important measures we track on social media are "Tweet id", "Tweet text", "time", "impressions", "engagements", "retweets", "replies", "likes", and "follows", among others. Although the system is new, we have already collected tweets from another recent flood event to validate our system's performance, along with the general flood tweet collection discussed in detail in the next section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>6) Social Media Text Classification:</head><p>The research in <ref type="bibr">[21]</ref> shows significant scope for solving this specific problem with a small amount of labeled data for disaster scenarios.</p><p>Universal Language Model Fine-tuning (ULMFiT): ULMFiT <ref type="bibr">[15]</ref> was introduced by Howard and Ruder and can be applied effectively as a transfer learning method for various NLP tasks. An example where researchers have used ULMFiT to solve a specific problem through transfer learning is <ref type="bibr">[21]</ref>.</p><p>Social Sensing Algorithm Details: Algorithm 2 summarizes the social sensing text classification module, which uses transfer learning to classify highly relevant flood tweets with minimal supervision (a small labeled dataset).</p><p>It first performs General-Domain Language Model Pretraining by using a language model (LM) pre-trained on the WikiText-103 dataset, with the AWD-LSTM architecture <ref type="bibr">[19]</ref>. This is an optimized LSTM network architecture with an embedding size of 400, 3 layers, and 1150 hidden activations per layer. We use the same architecture for the target LM, retraining it via back-propagation through time, fine-tuning on T_D and updating the weights. Next, we perform Target Task Language Model Fine-Tuning: the general model is fine-tuned for the target task and adapts to the new (target) domain by learning the target task-specific features of the language. This is done using discriminative fine-tuning and slanted triangular learning rates. Finally, we perform Target Task Classification. The classifier ends in a softmax layer that outputs a probability for each label (related and unrelated) of the corresponding tweet, and it is fine-tuned using gradual unfreezing along with some other hyper-parameters.
In the next section we present the experimental data analysis using the public sensor and social datasets.</p></div>
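<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two learning-rate techniques named above can be made concrete with a short sketch. It follows the slanted triangular schedule and the per-layer decay described in the ULMFiT paper <ref type="bibr">[15]</ref>; the default values used here (`cut_frac` of 0.1, `ratio` of 32, a maximum rate of 0.01 and a decay factor of 2.6) are the ones suggested in that paper, not the settings of this work, and the function names are hypothetical.</p>
<p><![CDATA[
```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step t: a short linear warm-up followed by a long
    linear decay. Assumes total_steps * cut_frac >= 1 so that cut > 0."""
    cut = math.floor(total_steps * cut_frac)
    if t < cut:
        p = t / cut                                      # warm-up phase
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))   # decay phase
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, n_layers, decay=2.6):
    """Discriminative fine-tuning: the last layer gets top_lr and each
    earlier layer gets the rate of the layer above it divided by decay."""
    return [top_lr / decay ** (n_layers - 1 - i) for i in range(n_layers)]


# Schedule over 100 steps: lowest at the ends, peak at the cut point (step 10).
for step in (0, 5, 10, 55, 100):
    print(step, round(slanted_triangular_lr(step, 100), 6))

# Per-layer rates for a 3-layer network, earliest layer first.
print(discriminative_lrs(0.01, 3))
```
]]></p>
<p>Gradual unfreezing then controls which layers are trained at all: training starts with only the last layer unfrozen and adds one earlier layer per epoch.</p></div>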
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. EXPERIMENTAL &amp; ANALYSIS RESULTS</head><p>This section presents and discusses the analysis results for the physical sensor data and the social sensing data. We begin by analyzing the two kinds of data independently and then turn to the fused data, in order to evaluate the performance of the proposed amalgamated model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Physical Sensor Data Analysis</head><p>We have collected similar data sets from two sources. One comes from our field-deployed sensors (UMBC Flood Bots) within our Spatial Envelope, and the second from a public data set covering the vicinity and the same downstream river.</p><p>1) UMBC Flood Bots: In order to bound the physical sensor data and be able to infuse it into the social sensing model, we deployed 4 physical sensors in Ellicott City. These nodes are live and transmitting data. The GIS coordinates of these four nodes (shown in Fig. <ref type="figure">2a</ref>) are UMBC Roger Avenue Node (39.2686,-76.8092), UMBC Hamilton Street Node (39.2672,-76.7998), UMBC Cour Avenue Node (39.2685,-76.7993), and UMBC St.Paul Road Node (39.2614,-76.7937). We are still in the early phase of data collection with these deployed sensors, and hence the further analysis and machine learning exploration are based on the public data set we collected from the Howard County Government's website.</p><p>2) Public Data Set: This data set is collected from the Howard County government website, and the subsequent discussion is based on the public data set only.</p><p>Data Collection and Processing: Data is collected as a temporal series for Stage Gage Sensors (battery voltage, flow volume, rate of elevation change, stage) and Rain Gage Sensors (battery voltage, precipitation accumulation, precipitation increment). The stage gauge and rain gauge sensors are heterogeneous and have different sampling frequencies. We therefore apply appropriate feature extraction, such as "Normalized Accumulation", and interpolation to obtain consistent data.</p><p>Water Level Forecasting: The LSTM takes all other features into consideration, learns the characteristics of the flood pattern, and forecasts better than all but one of the other models. Fig. ?? shows the water level forecast accuracy for the LSTM.
We used 2 layers, 50 activations per layer, Mean Absolute Error (MAE) as the loss function, the Adam optimizer, a batch size of 72, and 500 epochs. The RMSE of the LSTM forecasting model is 0.201, which is significantly lower than that of the other forecasting models we tried; we could not show those results due to limited space.</p></div>
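<div xmlns="http://www.tei-c.org/ns/1.0"><p>The forecasting model above is trained with MAE as the loss and reported with RMSE. For reference, both metrics can be computed as in the following sketch; the function names and the toy values are illustrative and are not the data behind the reported 0.201.</p>
<p><![CDATA[
```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


observed = [1.0, 2.0, 3.0]   # toy water levels
forecast = [1.0, 2.0, 5.0]   # toy forecasts with one large miss
print("MAE :", mae(observed, forecast))
print("RMSE:", rmse(observed, forecast))
```
]]></p>
<p>Because RMSE squares each error before averaging, the single large miss in the toy data raises RMSE well above MAE, which is why RMSE is a sensible headline metric when large water level errors matter most.</p></div>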
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Social Sensor Data Analysis</head><p>We collected social media data using keyword search for the flood event that occurred in Ellicott City on May 27th, 2019 (the general tweets). We also collected data with our newly proposed system (the fusion tweets) for the Ellicott City flood event that occurred on October 31st, 2019. We analyzed both datasets and acquired insightful results. We validate our proposed system with the data from the later flood event.</p><p>We used almost 400 tweets for the first flood (general tweets) and 150 tweets for the second flood (fusion tweets) to build and evaluate our text classification model. We labeled a tweet as related if it is from or about Ellicott City and labeled tweets about floods in other locations as unrelated, in order to extract meaningful localized contextual information. We freeze the weights of the LSTM layers and train the embedding and decoding weights of the model for one epoch, with &#946;_1 set to 0.9 and &#946;_2 between 0.7 and 0.8, a dropout of 0.5 for regularization, and a learning rate of 0.001; we then use discriminative fine-tuning to adapt to the target domain. Finally, we perform the target tweet classification using gradual unfreezing (learning rates from 0.0001 to 0.001) with momentum (moms) parameters of (0.8, 0.7), and record the training loss, validation loss, accuracy, and time taken. After carefully selecting the appropriate hyperparameters and weights, we report the final evaluation results as precision, recall, F1 score, and support. Accurately classified related flood tweets (T_r) are sent for corroboration and for use by the authorities. As we can see in Table <ref type="table">I</ref>, after fine-tuning the target language model and the classifier we obtain 76% accuracy on the general tweets and 90% accuracy on the new fusion flood tweets (including contextual tweets from @umbc floodbot).
Our proposed system is capable of including more localized contextual details from the social media data and reached 90% classification accuracy, whereas we could only achieve 76% accuracy with the general tweets, which do not include the local sensor and contextual details. Table II reports the classification evaluation, showing Precision, Recall &amp; F-Score for the general tweets, the fusion tweets, and a baseline model for comparison. It shows that our proposed model using the fusion tweets provides meaningful contextual details with higher accuracy than the baseline model (Logistic Regression). Our model is able to learn the complex annotations and relationships efficiently even with very few labeled data points. Table III represents the final piece of our proposed system, where successfully classified flood-related tweets can be further categorized, using this crucial localized information, into more meaningful classes such as area-update, queries, help, support, flood-update, and casualty. The authorities and emergency response services can help many people at once by tracking the major issues raised in local people's tweets in each category. Some challenges we faced during the preliminary analysis of each model and during the fusion of these models are discussed below.</p></div>
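<div xmlns="http://www.tei-c.org/ns/1.0"><p>The evaluation measures reported for the tweet classifier (accuracy, precision, recall and F1 score) can be computed from true and predicted labels as in the sketch below. The label strings, the function names and the five toy tweets are assumptions for illustration and do not reproduce the results in Tables I and II.</p>
<p><![CDATA[
```python
def precision_recall_f1(y_true, y_pred, positive="related"):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of tweets whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels for five tweets, not taken from the collected datasets.
truth = ["related", "related", "related", "unrelated", "unrelated"]
preds = ["related", "related", "unrelated", "related", "unrelated"]
print(precision_recall_f1(truth, preds))
print(accuracy(truth, preds))
```
]]></p>
<p>With small and imbalanced labeled sets such as those described here, per-class precision and recall are more informative than accuracy alone, since a classifier can reach high accuracy by favoring the majority class.</p></div>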
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Challenges with the social media data</head><p>Semi-structured &amp; imbalanced data: Raw social data contains text, images, news feeds, geo-locations (generally very few), etc., which differ in structure and volume.</p><p>Time &amp; usage dependence: The volume of social media data follows the trend of the event. Capturing and filtering the crucial data in time is challenging.</p><p>Massive extraneous tweets: Even a search based on highly relevant keywords returns a significant amount of irrelevant data (warnings opened/closed for other counties or states, bot or duplicate messages with different timestamps, etc.).</p><p>Annotation dilemma: Tweets that are irrelevant to our localized flood often carry flood-related information about neighboring counties or states. It is challenging to label such tweets as unrelated and still build a strong location-specific flood classification model.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Challenges with the sensor data</head><p>Sensor heterogeneity: It is difficult to synchronize multiple sensors and to settle on an optimal common sampling rate when merging them.</p><p>Noisy/faulty data: Sensor data can be noisy or faulty depending on the sensor's physical condition and other environmental effects.</p><p>Resource/location constraints: Fixed physical sensors provide data from a certain location, whereas social media data covers a much larger area and provides a bigger picture as an event occurs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Challenges with Data Fusion</head><p>In this work, we made some basic assumptions to address the heterogeneity challenges discussed above. Different time scales: We propose a spatial envelope with a time domain joiner between the sensor and social data, based on their usage patterns.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>VI. CONCLUSION</head><p>In this work, we develop a framework consisting of multiple models that use different kinds of flood-related data, coming from sensor network and social media sources, to build a sensor-social data fusion flood detection system. We propose a novel data fusion framework and present the data analysis for accurately classifying localized, context-rich flood-related tweets. We have deployed our sensor system and integrated it with a social media platform in order to obtain direct local contextual feeds from social users. Our model was able to learn the local contextual details efficiently and classified the fusion tweets with 90% accuracy. We have validated and evaluated our framework on the flood event that occurred in Ellicott City on October 31st, 2019, and discussed the benefits of sensor and social data, the actionable details they provide, and the challenges involved.</p></div>
		</body>
		</text>
</TEI>
