
Title: Water Leakage Detection Using Neural Networks
The primary goal of the project is to leverage recent developments in smart water technologies to detect and reduce water leakages in large water distribution networks with the aid of neural networks. Many water utilities need a cost-effective, non-invasive solution for detecting leakages in transmission pipelines, as it would lead to significant water savings and reduced pipe breakage frequencies, especially in aging infrastructure. In this project we propose building a regression model based on the Multi-Layer Perceptron (MLP), a class of feedforward Artificial Neural Networks (ANNs), to detect leak locations within a proposed network. The model should learn the structure of the system, i.e., the mapping between leak nodes and sensor nodes in an area, so that it can identify leak nodes from measured pressure values with high accuracy. The eventual goal of the project is to test the ANN model on a real network using field-measured pressure and pipe breakage data, after the model has been tuned and developed with simulated data.
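As a rough illustration of the approach described above, the sketch below trains an MLP regressor to map simulated sensor pressures to a leak location. It is a minimal sketch under assumed conditions: the data, network geometry, sensor count, and hyperparameters are all synthetic placeholders (using NumPy and scikit-learn), not the configuration used in the paper.

```python
# Minimal sketch: MLP regression from sensor pressures to leak coordinates.
# Synthetic data only; geometry and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_samples, n_sensors = 2000, 12

# Hypothetical leak locations (x, y) in a 1 km x 1 km service area.
leak_xy = rng.uniform(0.0, 1000.0, size=(n_samples, 2))

# Toy "hydraulic model": pressure drop at each sensor decays with distance to the leak.
sensor_xy = rng.uniform(0.0, 1000.0, size=(n_sensors, 2))
dist = np.linalg.norm(leak_xy[:, None, :] - sensor_xy[None, :, :], axis=2)
pressures = 50.0 - 10.0 * np.exp(-dist / 300.0) + rng.normal(0, 0.2, size=dist.shape)

X_train, X_test, y_train, y_test = train_test_split(pressures, leak_xy, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("Mean localization error (m):",
      np.mean(np.linalg.norm(model.predict(X_test) - y_test, axis=1)))
```

In practice the training pressures would come from a hydraulic simulator rather than the toy distance model used here, but the input/output structure of the regression stays the same.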
Authors:
Award ID(s):
1919228
Publication Date:
NSF-PAR ID:
10282254
Journal Name:
World Environmental and Water Resources Congress 2021
Page Range or eLocation-ID:
1033 to 1040
Sponsoring Org:
National Science Foundation
More Like this
  1. It is estimated that about 20% of treated drinking water is lost through distribution pipeline leakages in the United States. Pipeline leakage detection is a top priority for water utilities across the globe, as leaks increase operational energy consumption and, if left unaddressed, can develop into potentially catastrophic water main breaks. Leakage detection is a laborious task often limited by the financial and human resources that utilities can afford. Many conventional leak detection techniques also only offer a snapshot indication of leakage presence. Furthermore, the reliability of many leakage detection techniques on plastic pipelines, which are increasingly preferred for drinking water applications, is questionable. As part of a smart water utility framework, this paper proposes and validates a hydraulic model-based technique for detecting and assessing the severity of leakages in buried water pipelines through monitoring of pressure from across the water distribution system (WDS). The envisioned smart water utility framework entails the capabilities to collect water consumption data from a limited number of WDS nodes and pressure data from a limited number of pressure monitoring stations placed across the WDS. A popular benchmark WDS is initially modified by inducing leakages through the addition of orifice nodes. The leakage severity is controlled using emitter coefficients of the orifice nodes. WDS pressure data for various sets of demands is subsequently gathered from locations where pressure monitoring stations are to be placed in that modified distribution network. An evolutionary optimization algorithm is then used to predict the emitter coefficients, and thereby the leakage severities, based on the hydraulic dependency of the monitored pressure data on various sets of nodal demands. Artificial neural networks (ANNs) are employed to mimic the popular hydraulic solver EPANET 2.2 for high computational efficiency. The goals of this study are to: (1) validate the proof of concept of the proposed modeling approach for detecting and assessing the severity of leakages and (2) evaluate the sensitivity of the prediction accuracy to the number of pressure monitoring stations and the number of demand nodes at which consumption data is gathered and used. This study offers new value for prioritizing pipes for rehabilitation by predicting leakages through a hydraulic model-based approach.
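The sketch below illustrates the emitter-coefficient estimation idea summarized in item 1: an evolutionary optimizer searches for leak severities that make model-predicted pressures match monitored pressures. This is only a sketch under stated assumptions; the "surrogate" here is a toy linear stand-in for the ANN/EPANET model, and all data and bounds are synthetic.

```python
# Minimal sketch: differential evolution fits emitter coefficients so that
# surrogate-predicted pressures match "observed" pressures. Everything is synthetic.
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)
n_leak_nodes, n_stations = 5, 8

# Toy sensitivity matrix: pressure drop at each station per unit emitter coefficient.
sensitivity = rng.uniform(0.5, 2.0, size=(n_stations, n_leak_nodes))
baseline_pressure = rng.uniform(40.0, 60.0, size=n_stations)

def surrogate_pressures(emitter_coeffs):
    """Stand-in for the ANN surrogate of a hydraulic solver: pressures given leak severities."""
    return baseline_pressure - sensitivity @ emitter_coeffs

# "True" leak severities we pretend were observed in the field.
true_coeffs = np.array([0.0, 0.8, 0.0, 0.3, 0.0])
observed = surrogate_pressures(true_coeffs) + rng.normal(0, 0.05, n_stations)

def mismatch(coeffs):
    # Sum of squared pressure residuals to be minimized by the evolutionary search.
    return float(np.sum((surrogate_pressures(coeffs) - observed) ** 2))

result = differential_evolution(mismatch, bounds=[(0.0, 1.0)] * n_leak_nodes, seed=0)
print("Estimated emitter coefficients:", np.round(result.x, 2))
```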
  2. Water distribution systems (WDSs) face a significant challenge in the form of pipe leaks. Pipe leaks can cause loss of a large amount of treated water, leading to pressure loss, increased energy costs, and contamination risks. Locating pipe leaks has been a constant challenge for water utilities and stakeholders due to the underground location of the pipes. Physical methods to detect leaks are expensive, intrusive, and heavily localized. Computational approaches provide an economical alternative to physical methods. Data-driven, machine learning-based computational approaches have garnered growing interest in recent years to address the challenge of detecting pipe leaks in WDSs. While several studies have applied machine learning models for leak detection on single pipes and small test networks, their applicability to real-world WDSs is unclear. Most of these studies simplify the leak characteristics and ignore modeling and measuring-device uncertainties, which makes the scalability of their approaches to real-world WDSs questionable. Our study addresses this issue by devising four study cases that account for realistic leak characteristics (multiple, multi-size, and randomly located leaks) and by incorporating noise in the input data to account for model- and measuring device-related uncertainties. A machine learning-based approach that uses simulated pressure as input to predict both the location and the size of leaks is proposed. Two machine learning models, a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN), are trained and tested for the four study cases, and their performances are compared. The precision and recall results for the L-Town network indicate good accuracy for both models across all study cases, with the CNN generally outperforming the MLP.
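The setup in item 2 can be sketched as follows: simulated pressures are perturbed with noise to mimic model and measurement uncertainty, and a network predicts a leak-size value for every candidate location, so location and size are recovered jointly. This is a hedged sketch with synthetic data; the noise level, layer sizes, and number of candidate leak locations are assumptions, and an MLP is used here as the simpler of the two models compared in that study.

```python
# Minimal sketch: noisy simulated pressures -> per-location leak sizes
# (a zero output means "no leak at that location"). All data is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
n_samples, n_sensors, n_candidate_leaks = 3000, 10, 20

# Synthetic leak-size vectors: one or two randomly located leaks per sample.
leak_sizes = np.zeros((n_samples, n_candidate_leaks))
for i in range(n_samples):
    idx = rng.choice(n_candidate_leaks, size=rng.integers(1, 3), replace=False)
    leak_sizes[i, idx] = rng.uniform(0.1, 1.0, size=idx.size)

# Toy pressure response plus added noise standing in for device/model uncertainty.
response = rng.uniform(0.5, 2.0, size=(n_candidate_leaks, n_sensors))
pressures = 50.0 - leak_sizes @ response
pressures += rng.normal(0.0, 0.1, size=pressures.shape)  # measurement noise

model = MLPRegressor(hidden_layer_sizes=(128, 64), max_iter=1000, random_state=0)
model.fit(pressures[:2500], leak_sizes[:2500])
pred = model.predict(pressures[2500:])
print("Mean absolute size error:", np.mean(np.abs(pred - leak_sizes[2500:])))
```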
  3. Leakages in water distribution networks (WDNs) are estimated to globally cost 39 billion USD per year and cause water and revenue losses, infrastructure degradation, and other cascading effects. Their impacts can be prevented and mitigated with prompt identification and accurate leak localization. In this work, we propose the leakage identification and localization algorithm (LILA), a pressure-based algorithm for data-driven leakage identification and model-based localization in WDNs. First, LILA identifies potential leakages via semi-supervised linear regression of pairwise sensor pressure data and provides the location of their nearest sensors. Second, LILA locates leaky pipes relying on an initial set of candidate pipes and a simulation-based optimization framework with iterative linear and mixed-integer linear programming. LILA is tested on data from the L-Town network devised for the Battle of Leakage Detection and Isolation Methods. Results show that LILA can identify all leakages included in the data set and locate them within a maximum distance of 374 m from their real location. Abrupt leakages are identified immediately or within 2 h, while more time is required to raise alarms on incipient leakages.
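The pressure-based identification step described in item 3 can be illustrated with a small sketch: fit a linear relation between a pair of sensors' pressures on leak-free data, then raise an alarm when the residual drifts beyond a threshold. This is only a minimal sketch of the pairwise-regression idea, with synthetic pressure traces and an assumed threshold, not the LILA algorithm's actual tuning or localization stage.

```python
# Minimal sketch: pairwise sensor regression with residual-based leak alarms.
# Pressure traces, leak timing, and the 5-sigma threshold are synthetic assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
t = np.arange(2000)

# Leak-free behavior: sensor B tracks sensor A linearly plus small noise.
p_a = 50.0 + 3.0 * np.sin(2 * np.pi * t / 288)          # daily demand pattern
p_b = 0.9 * p_a - 2.0 + rng.normal(0.0, 0.05, t.size)

# Introduce a leak near sensor B after t = 1500: its pressure drops relative to A.
p_b[1500:] -= 0.8

# Fit on an early, leak-free window only.
reg = LinearRegression().fit(p_a[:1000].reshape(-1, 1), p_b[:1000])
residual = p_b - reg.predict(p_a.reshape(-1, 1))

threshold = 5.0 * residual[:1000].std()
alarm_times = t[np.abs(residual) > threshold]
print("First alarm at t =", alarm_times[0] if alarm_times.size else None)
```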
  4. This work investigates the siphon break phenomenon associated with pipe leakage location. The study is divided into two parts: (1) an unsteady three-dimensional (3D) computational fluid dynamics (CFD) model is developed to simulate the pressure head (water level) and discharge in the simulated siphon using the volume-of-fluid (VOF) technique under the no-leakage condition, and (2) using the model developed in the first part, the siphon break phenomenon associated with pipe leakage location is investigated. The calculated transient water level and discharge rate at the simulated siphon for the no-leakage condition were in good agreement with the experimental measurements. In addition, the velocity, pressure fields, and phase fractions in the siphon pipe were analyzed in depth. The methodology and findings show that leakage above the hydraulic grade line and close to the top inverted-U section of the siphon pipe ultimately leads to siphon breakage, which is not the case for a leakage below the hydraulic grade line. It is also concluded that if the leakage is above the hydraulic grade line but far away from the upper horizontal section of the siphon pipe, it may not lead to immediate siphon breakage, as the ingested air is removed with the siphoning water, so that complete siphon breakage takes longer to develop.
  5. Obeid, Iyad; Picone, Joseph; Selesnick, Ivan (Eds.)
    The Neural Engineering Data Consortium (NEDC) is developing a large open source database of high-resolution digital pathology images known as the Temple University Digital Pathology Corpus (TUDP) [1]. Our long-term goal is to release one million images. We expect to release the first 100,000 image corpus by December 2020. The data is being acquired at the Department of Pathology at Temple University Hospital (TUH) using a Leica Biosystems Aperio AT2 scanner [2] and consists entirely of clinical pathology images. More information about the data and the project can be found in Shawki et al. [3]. We currently have a National Science Foundation (NSF) planning grant [4] to explore how best the community can leverage this resource. One goal of this poster presentation is to stimulate community-wide discussions about this project and determine how this valuable resource can best meet the needs of the public. The computing infrastructure required to support this database is extensive [5] and includes two HIPAA-secure computer networks, dual petabyte file servers, and Aperio’s eSlide Manager (eSM) software [6]. We currently have digitized over 50,000 slides from 2,846 patients and 2,942 clinical cases. There is an average of 12.4 slides per patient and 10.5 slides per case with one report per case. The data is organized by tissue type as shown below (a short parsing sketch based on this naming convention follows this summary):
    Filenames:
    tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_0a001_00123456_lvl0001_s000.svs
    tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_00123456.docx
    Explanation:
    tudp: root directory of the corpus
    v1.0.0: version number of the release
    svs: the image data type
    gastro: the type of tissue
    000001: six-digit sequence number used to control directory complexity
    00123456: 8-digit patient MRN
    2015_03_05: the date the specimen was captured
    0s15_12345: the clinical case name
    0s15_12345_0a001_00123456_lvl0001_s000.svs: the actual image filename, consisting of a repeat of the case name, a site code (e.g., 0a001), the type and depth of the cut (e.g., lvl0001), and a token number (e.g., s000)
    0s15_12345_00123456.docx: the filename for the corresponding case report
    We currently recognize fifteen tissue types in the first installment of the corpus. The raw image data is stored in Aperio’s “.svs” format, which is a multi-layered compressed JPEG format [3,7]. Pathology reports containing a summary of how a pathologist interpreted the slide are also provided in a flat text file format. A more complete summary of the demographics of this pilot corpus will be presented at the conference. Another goal of this poster presentation is to share our experiences with the larger community, since many of these details have not been adequately documented in scientific publications. There are quite a few obstacles in collecting this data that have slowed down the process and need to be discussed publicly. Our backlog of slides dates back to 1997, meaning there are a lot that need to be sifted through and discarded for peeling or cracking. Additionally, during scanning a slide can get stuck, stalling a scan session for hours and resulting in a significant loss of productivity. Over the past two years, we have accumulated significant experience with how to scan a diverse inventory of slides using the Aperio AT2 high-volume scanner. We have been working closely with the vendor to resolve many problems associated with the use of this scanner for research purposes.
This scanning project began in January of 2018 when the scanner was first installed. The scanning process was slow at first since there was a learning curve with how the scanner worked and how to obtain samples from the hospital. From its start date until May of 2019, ~20,000 slides were scanned. In the six months from May to November we tripled that number and now hold ~60,000 slides in our database. This dramatic increase in productivity was due to additional undergraduate staff members and an emphasis on efficient workflow. The Aperio AT2 scans 400 slides a day, requiring at least eight hours of scan time. The efficiency of these scans can vary greatly. When our team first started, approximately 5% of slides failed the scanning process due to focal point errors. We have been able to reduce that to 1% through a variety of means: (1) best practices regarding daily and monthly recalibrations, (2) tweaking the software such as the tissue finder parameter settings, and (3) experience with how to clean and prep slides so they scan properly. Nevertheless, this is not a completely automated process, making it very difficult to reach our production targets. With a staff of three undergraduate workers spending a total of 30 hours per week, we find it difficult to scan more than 2,000 slides per week using a single scanner (400 slides per night x 5 nights per week). The main limitation in achieving this level of production is the lack of a completely automated scanning process; it takes a couple of hours to sort, clean, and load slides. We have streamlined all other aspects of the workflow required to database the scanned slides so that there are no additional bottlenecks. To bridge the gap between hospital operations and research, we are using Aperio’s eSM software. Our goal is to provide pathologists access to high-quality digital images of their patients’ slides. eSM is a secure website that holds the images with their metadata labels, patient report, and path to where the image is located on our file server. Although eSM includes significant infrastructure to import slides into the database using barcodes, TUH does not currently support barcode use. Therefore, we manage the data using a mixture of Python scripts and manual import functions available in eSM. The database and associated tools are based on proprietary formats developed by Aperio, making this another important point of community-wide discussion on how best to disseminate such information. Our near-term goal for the TUDP Corpus is to release 100,000 slides by December 2020. We hope to continue data collection over the next decade until we reach one million slides. We are creating two pilot corpora using the first 50,000 slides we have collected. The first corpus consists of 500 slides with a marker stain and another 500 without it. This set was designed to let people debug their basic deep learning processing flow on these high-resolution images. We discuss our preliminary experiments on this corpus and the challenges in processing these high-resolution images using deep learning in [3]. We are able to achieve a mean sensitivity of 99.0% for slides with pen marks, and 98.9% for slides without marks, using a multistage deep learning algorithm. While this dataset was very useful in initial debugging, we are in the midst of creating a new, more challenging pilot corpus using actual tissue samples annotated by experts. The task will be to detect ductal carcinoma in situ (DCIS) or invasive breast cancer tissue.
There will be approximately 1,000 images per class in this corpus. Based on the number of features annotated, we can train on a two-class problem of DCIS or benign, or increase the difficulty by expanding the classes to include DCIS, benign, stroma, pink tissue, non-neoplastic, etc. Those interested in the corpus or in participating in community-wide discussions should join our listserv, nedc_tuh_dpath@googlegroups.com, to be kept informed of the latest developments in this project. You can learn more from our project website: https://www.isip.piconepress.com/projects/nsf_dpath.
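As referenced in item 5 above, the sketch below unpacks the documented TUDP image filename convention (case name, site code, cut level, token). The helper name and the dictionary field labels are my own hypothetical choices based on the quoted explanation, not an official NEDC utility.

```python
# Hypothetical helper for the TUDP filename convention described above.
def parse_tudp_filename(filename):
    # Example: 0s15_12345_0a001_00123456_lvl0001_s000.svs
    stem = filename.rsplit(".", 1)[0]
    parts = stem.split("_")
    return {
        "case_name": "_".join(parts[0:2]),   # e.g., 0s15_12345
        "site_code": parts[2],               # e.g., 0a001
        "patient_mrn": parts[3],             # e.g., 00123456
        "cut_level": parts[4],               # e.g., lvl0001
        "token": parts[5],                   # e.g., s000
    }

print(parse_tudp_filename("0s15_12345_0a001_00123456_lvl0001_s000.svs"))
```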