Title: AD2: Improving Quality of IoT Data through Compressive Anomaly Detection
With recent technological advances in sensor nodes, IoT-enabled applications have great potential in many domains. However, sensing data may be inaccurate due not only to faults or failures in the sensor and network but also to the limited resources and transmission capability available in sensor nodes. In this paper, we first model streams of IoT data as a handful of sampled data in the transformed domain, assuming that the information captured by those samples reveals different sparsity profiles for normal and abnormal data. We then present a novel approach called AD2 (Anomaly Detection using Approximated Data) that applies a transformation to the original data, samples the top k dominant components, and detects data anomalies based on the disparity in k values. To demonstrate the effectiveness of AD2, we use IoT datasets (temperature, humidity, and CO) collected from real-world wireless sensor nodes. Our experimental evaluation demonstrates that AD2 can approximate the data and successfully detect 64%-94% of anomalies using only 1.9% of the original data while minimizing false positive rates; achieving the same level of accuracy would otherwise require the entire dataset.
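The pipeline described in the abstract (transform each window, keep the top k dominant components, compare k across windows) can be illustrated with a minimal sketch. The abstract does not say which transform, energy threshold, or decision rule AD2 uses, so everything specific here is an assumption made for illustration: an orthonormal DCT-II as the transform, a 99% energy fraction to define "dominant", the median k of some reference windows as the baseline, and a fixed `tolerance` on the difference in k. The function names are hypothetical.

```python
import math
import statistics

def dct2(window):
    """Orthonormal DCT-II of a list of floats (O(n^2), fine for short windows)."""
    n = len(window)
    coeffs = []
    for f in range(n):
        # Orthonormal scaling so that coefficient energies sum to the signal energy.
        scale = math.sqrt(1.0 / n) if f == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * f / n)
                    for i, x in enumerate(window))
        coeffs.append(scale * total)
    return coeffs

def dominant_k(window, energy_fraction=0.99):
    """Smallest number of largest-magnitude coefficients holding the target energy.

    The 99% default is an assumed threshold, not a value from the paper.
    """
    energies = sorted((c * c for c in dct2(window)), reverse=True)
    total = sum(energies)
    if total == 0.0:
        return 0  # an all-zero window has no dominant components
    running = 0.0
    for k, energy in enumerate(energies, start=1):
        running += energy
        if running >= energy_fraction * total:
            return k
    return len(energies)

def flag_anomalies(windows, reference_windows, tolerance=2):
    """Flag windows whose k differs from the reference median k by more than tolerance.

    Using the median of reference windows as the "normal" k is an assumption.
    """
    baseline = statistics.median(dominant_k(w) for w in reference_windows)
    return [abs(dominant_k(w) - baseline) > tolerance for w in windows]
```

In this sketch a smooth window compacts into very few coefficients, while a window containing an isolated spike spreads its energy across many coefficients, so its k is much larger. Only the k value (plus the retained coefficients, if the data itself is needed) would have to be kept per window, which is the sense in which detection works on approximated data.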
Award ID(s):
1751143
NSF-PAR ID:
10136594
Journal Name:
IEEE International Conference on Big Data (Big Data)
Page Range / eLocation ID:
1662 to 1668
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Edge devices with attentive sensors enable various intelligent services by exploring streams of sensor data. However, anomalies, which are inevitable due to faults or failures in the sensor and network, can result in incorrect or unwanted operational decisions. While promptly ensuring the accuracy of IoT data is critical, the lack of labels for live sensor data and limited storage resources necessitate efficient and reliable detection of anomalies at edge nodes. Motivated by the existence of unique sparsity profiles, which express original signals as a combination of a few coefficients and differ between normal and abnormal sensing periods, we propose a novel anomaly detection approach called ADSP (Anomaly Detection with Sparsity Profile). The key idea is to apply a transformation to the raw data, identify the top-K dominant components that represent normal data behaviors, and detect data anomalies in an unsupervised manner based on the disparity from the K values that approximate periods of normal data. Our evaluation using a set of synthetic datasets demonstrates that ADSP can achieve 92%–100% detection accuracy. To validate our anomaly detection approach on real-world cases, we label potential anomalies under a range of error boundary conditions, using sensors that exhibit a straight line in the Q-Q plot and strong Pearson correlation, and conduct a controlled comparison of the detection accuracy. Our experimental evaluation using real-world datasets demonstrates that ADSP can detect 83%–92% of anomalies using only 1.7% of the original data, which is comparable to the accuracy achieved by using the entire datasets.
  2. Discovery and classification of motifs (repeated patterns) and discords (anomalies) in time series is fundamental to many scientific fields. These and related problems have effectively been solved for offline analysis of time series; however, these approaches are computationally intensive and do not lend themselves to streaming time series, such as those produced by IoT sensors, where the sampling rate imposes real-time constraints on computation and there is a strong desire to locate computation as close as possible to the sensor. One promising solution is to use low-cost machine learning models to provide approximate answers to these problems. For example, prior work has trained models to predict the similarity of the most recently sampled window of data points to the time series used for training. This work addresses a more challenging problem, which is to predict not only the “strength” of the match, but also the relative location in the representative time series where the strongest matching subsequences occur. We evaluate our approach on two different real-world datasets; we demonstrate speedups as high as about 30x compared to exact computations, with predictive accuracy as high as 87.95%, depending on the granularity of the prediction.
  3. There is an increasing demand for performing machine learning tasks, such as human activity recognition (HAR), on emerging ultra-low-power internet of things (IoT) platforms. Recent works show substantial efficiency boosts from performing inference tasks directly on the IoT nodes rather than merely transmitting raw sensor data. However, the computation and power demands of deep neural network (DNN) based inference pose significant challenges when executed on the nodes of an energy-harvesting wireless sensor network (EH-WSN). Moreover, managing inferences requiring responses from multiple energy-harvesting nodes imposes challenges at the system level in addition to the constraints at each node. This paper presents a novel scheduling policy along with an adaptive ensemble learner to efficiently perform HAR on a distributed energy-harvesting body area network. Our proposed policy, Origin, strategically ensures efficient and accurate individual inference execution at each sensor node by using a novel activity-aware scheduling approach. It also leverages the continuous nature of human activity when coordinating and aggregating results from all the sensor nodes to improve final classification accuracy. Further, Origin proposes an adaptive ensemble learner to personalize the optimizations based on each individual user. Experimental results using two different HAR datasets show Origin, while running on harvested energy, to be at least 2.5% more accurate than a classical battery-powered energy-aware HAR classifier continuously operating at the same average power.
  4. Sensory IoT (Internet of Things) networks have been widely applied and studied in recent years and have demonstrated their unique benefits in various areas. In this paper, we bring the sensor network to an application scenario that has rarely been studied: academic cleanrooms. We design SENSELET++, a low-cost IoT sensing platform that can collect, manage, and analyze a large amount of sensory data from heterogeneous sensors. Furthermore, we design a novel hybrid anomaly detection framework that can detect both time-critical and complex non-critical anomalies. We validate SENSELET++ through the deployment of the sensing platform in a lithography cleanroom. Our results show the scalability, flexibility, and reliability properties of the system design. Also, using real-world sensory data collected by SENSELET++, our system can analyze data streams in real time and detect shape and trend anomalies with a 91% true positive rate.
  5. With the expansion of sensor nodes to newer avenues of technology, such as the Internet of things (IoT), internet of bodies (IoB), augmented reality (AR), and mixed reality, the demand to support high-speed operations, such as audio and video, with a minimal increase in power consumption is gaining much traction. In this work, we focus on these nodes operating in audio-based AR (AAR) and explore the opportunity of supporting audio at a low power budget. For sensor nodes, communicating one bit of data usually consumes significantly more power than sensing and processing/computing one data bit. Reducing the number of communication bits at the expense of a few computation cycles therefore considerably reduces the overall power consumption of the nodes. Audio codecs such as AAC and LDAC that currently perform compression and decompression of audio streams burn significant power and create a floor on the minimum power possible in these applications. Compressive sensing (CS), a powerful mathematical tool for compression, is often used in physiological signal sensing, such as EEG and ECG, and it can offer a promising low-power alternative to audio codecs. We introduce a new paradigm of using the CS-based approach to realize audio compression that can function as a new independent technique or augment the existing codecs for a higher level of compression. This work, CS-Audio, fabricated in TSMC 65-nm CMOS technology, presents the first CS-based compression, equipped with an ON-chip DWT sparsifier for non-sparse audio signals. The CS design, realized in a pipelined architecture, achieves high data rates and enables a wake-up implementation to bypass computation for insignificant input samples, reducing the power consumption of the hardware. The measurement results demonstrate a 3X-15X reduction in transmitted audio data without a perceivable degradation of audio quality, as indicated by the perceptual evaluation of audio quality mean opinion score (PEAQ MOS) >1.5. The hardware consumes 238 μW power at 0.65 V and 15 Mbps, which is ~20X-40X lower than audio codecs.