
Title: Zero-bias Deep Learning Enabled Quickest Abnormal Event Detection in IoT
Abnormal event detection with the lowest latency is an indispensable function for safety-critical systems, such as cyber defense systems. However, as systems become increasingly complicated, conventional sequential event detection methods become less effective, especially when indicator metrics must be defined manually from complicated data. Although Deep Neural Networks (DNNs) have been used to handle heterogeneous data, their theoretical assurability and explainability are still insufficient. This paper provides a holistic framework for the quickest and sequential detection of abnormalities and time-dependent abnormal events. We explore the latent space characteristics of zero-bias neural networks with respect to classification boundaries and abnormalities. We then provide a novel method to convert zero-bias DNN classifiers into performance-assured binary abnormality detectors. Finally, we provide a sequential Quickest Detection (QD) scheme that uses the converted abnormality detector to achieve the theoretically assured lowest abnormal event detection delay under false alarm constraints. We verify the effectiveness of the framework using simulation and massive real signal records from aviation communication systems. Code and data are available at.
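For readers unfamiliar with quickest detection, the sequential step can be illustrated with the classic CUSUM test. This is a minimal sketch of the textbook procedure, not the scheme proposed in the paper: the function name `cusum_alarm_time`, the `drift` and `threshold` parameters, and the synthetic Gaussian scores are all assumptions made for illustration. The scores stand in for the per-sample output of any binary abnormality detector.

```python
import random


def cusum_alarm_time(scores, drift, threshold):
    """Return the 1-based index of the first alarm, or None if none is raised.

    Textbook CUSUM recursion (an illustrative stand-in, not the paper's scheme):
    evidence above `drift` is accumulated, and the statistic is clipped at zero
    so that long stretches of normal data do not delay a later detection.
    """
    stat = 0.0
    for t, score in enumerate(scores, start=1):
        stat = max(0.0, stat + score - drift)
        if stat >= threshold:
            return t
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical abnormality scores: mean 0 before the change, mean 1 after it.
    change_point = 200
    scores = [rng.gauss(0.0, 1.0) for _ in range(change_point)]
    scores += [rng.gauss(1.0, 1.0) for _ in range(200)]
    # For a unit-variance Gaussian mean shift from 0 to 1, score - 0.5 is the
    # per-sample log-likelihood ratio, which motivates drift=0.5 here.
    print("alarm raised at sample:", cusum_alarm_time(scores, drift=0.5, threshold=8.0))
```

Raising `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the false alarm constraint in the abstract refers to.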
Journal Name:
IEEE Internet of Things Journal
Page Range or eLocation-ID:
1 to 1
Sponsoring Org:
National Science Foundation
More Like this
  1. The monitoring of data streams with a network structure has drawn increasing attention due to its wide applications in modern process control. In these applications, high-dimensional sensor nodes are interconnected with an underlying network topology. In such a case, abnormalities occurring at any node may propagate dynamically across the network and cause changes in other nodes over time. Furthermore, the high dimensionality of such data significantly increases the cost of resources for data transmission and computation, such that only partial observations can be transmitted or processed in practice. Overall, how to quickly detect abnormalities in such large networks with resource constraints remains a challenge, especially due to the sampling uncertainty under dynamic anomaly occurrences and network-based patterns. In this paper, we incorporate network structure information into the monitoring and adaptive sampling methodologies for quick anomaly detection in large networks where only partial observations are available. We develop a general monitoring and adaptive sampling method and further extend it to the case with memory constraints, both of which exploit network distance and centrality information for better process monitoring and identification of abnormalities. Theoretical investigations of the proposed methods demonstrate their sampling efficiency in balancing exploration and exploitation, as well as the detection performance guarantee. Numerical simulations and a case study on a power network have demonstrated the superiority of the proposed methods in detecting various types of shifts. Note to Practitioners: Continuous monitoring of networks for anomalous events is critical for a large number of applications involving power networks, computer networks, epidemiological surveillance, social networks, etc. This paper aims at addressing the challenges in monitoring large networks in cases where monitoring resources are limited such that only a subset of nodes in the network is observable. Specifically, we integrate network structure information of nodes for constructing sequential detection methods via effective data augmentation, and for designing adaptive sampling algorithms to observe suspicious nodes that are likely to be abnormal. Then, the method is further generalized to the case where the memory available for computation is also constrained due to the network size. The developed method is beneficial and effective for various anomaly patterns, especially when the initial anomaly occurs randomly at nodes in the network. The proposed methods are demonstrated to be capable of quickly detecting changes in the network and dynamically changing the sampling priority based on online observations in various cases, as shown in the theoretical investigation, simulations and case studies.
  2. Laser-based additive manufacturing (LBAM) provides unrivalled design freedom with the ability to manufacture complicated parts for a wide range of engineering applications. The melt pool is one of the most important signatures in LBAM and is indicative of process anomalies and part defects. High-speed thermal images of the melt pool captured during LBAM make in situ melt pool monitoring and porosity prediction possible. This paper aims to broaden current knowledge of the underlying relationship between process and porosity in LBAM and provide new possibilities for efficient and accurate porosity prediction. We present a deep learning-based data fusion method to predict porosity in LBAM parts by leveraging the measured melt pool thermal history and two newly created deep learning neural networks. A PyroNet, based on Convolutional Neural Networks, is developed to correlate in-process pyrometry images with layer-wise porosity; an IRNet, based on Long-term Recurrent Convolutional Networks, is developed to correlate sequential thermal images from an infrared camera with layer-wise porosity. Predictions from PyroNet and IRNet are fused at the decision level to obtain a more accurate prediction of layer-wise porosity. The model fidelity is validated with an LBAM Ti–6Al–4V thin-wall structure. This is the first work that manages to fuse pyrometer data and infrared camera data for metal additive manufacturing (AM). The case study results based on benchmark datasets show that our method can achieve high accuracy with relatively high efficiency, demonstrating the applicability of the method for in situ porosity detection in LBAM.
  3. Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. Recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.
  4. This paper is devoted to the detection of contingencies in modern power systems. Because the systems we consider fall under the framework of cyber-physical systems, it is necessary to take into consideration the information processing aspect and communication networks. A consequence is that noise and random disturbances are unavoidable. The detection problem then becomes one known as quickest detection. In contrast to running the detection problem in a discrete-time setting, which leads to a sequence of detection problems, this work focuses on the problem in a continuous-time setup. We treat stochastic differential equation models. One of the distinct features is that the systems are hybrid, involving both continuous states and discrete events that coexist and interact. The discrete event process is modeled by a continuous-time Markov chain representing random environments that are not represented by a continuous sample path. The quickest detection can then be written as an optimal stopping problem. This paper is devoted to finding numerical solutions to the underlying problem. We use a Markov chain approximation method to construct the numerical algorithms. Numerical examples are used to demonstrate the performance.
  5. Creating a large-scale dataset of abnormality annotation on medical images is a labor-intensive and costly task. Leveraging weak supervision from readily available data such as radiology reports can compensate for the lack of large-scale data for anomaly detection methods. However, most of the current methods only use image-level pathological observations, failing to utilize the relevant anatomy mentions in reports. Furthermore, Natural Language Processing (NLP)-mined weak labels are noisy due to label sparsity and linguistic ambiguity. We propose an Anatomy-Guided chest X-ray Network (AGXNet) to address these issues of weak annotation. Our framework consists of a cascade of two networks, one responsible for identifying anatomical abnormalities and the second responsible for pathological observations. The critical component in our framework is an anatomy-guided attention module that aids the downstream observation network in focusing on the relevant anatomical regions generated by the anatomy network. We use Positive Unlabeled (PU) learning to account for the fact that lack of mention does not necessarily mean a negative label. Our quantitative and qualitative results on the MIMIC-CXR dataset demonstrate the effectiveness of AGXNet in disease and anatomical abnormality localization. Experiments on the NIH Chest X-ray dataset show that the learned feature representations are transferable and can achieve state-of-the-art performance in disease classification and competitive disease localization results. Our code is available at