skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Anomaly Detection in Power System State Estimation: Review and New Directions
Foundational and state-of-the-art anomaly-detection methods through power system state estimation are reviewed. Traditional components for bad data detection, such as chi-square testing, residual-based methods, and hypothesis testing, are discussed to explain the motivations for recent anomaly-detection methods given the increasing complexity of power grids, energy management systems, and cyber-threats. In particular, state estimation anomaly detection based on data-driven quickest-change detection and artificial intelligence are discussed, and directions for research are suggested with particular emphasis on considerations of the future smart grid.  more » « less
Award ID(s):
1935389
PAR ID:
10477042
Author(s) / Creator(s):
; ;
Publisher / Repository:
Energies
Date Published:
Journal Name:
Energies
Volume:
16
Issue:
18
ISSN:
1996-1073
Page Range / eLocation ID:
6678
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Density estimation is a widely used method to perform unsupervised anomaly detection. By learning the density function, data points with relatively low densities are classified as anomalies. Unfortunately, the presence of anomalies in training data may significantly impact the density estimation process, thereby imposing significant challenges to the use of more sophisticated density estimation methods such as those based on deep neural networks. In this work, we propose RobustRealNVP, a deep density estimation framework that enhances the robustness of flow-based density estimation methods, enabling their application to unsupervised anomaly detection. RobustRealNVP differs from existing flow-based models from two perspectives. First, RobustRealNVP discards data points with low estimated densities during optimization to prevent them from corrupting the density estimation process. Furthermore, it imposes Lipschitz regularization to the flow-based model to enforce smoothness in the estimated density function. We demonstrate the robustness of our algorithm against anomalies in training data from both theoretical and empirical perspectives. The results show that our algorithm achieves competitive results as compared to state-of-the-art unsupervised anomaly detection methods. 
    more » « less
  2. Falsified data from compromised Phasor Measurement Units (PMUs) in a smart grid induce Energy Management Systems (EMS) to have an inaccurate estimation of the state of the grid, disrupting various operations of the power grid. Moreover, the PMUs deployed at the distribution layer of a smart grid show dynamic fluctuations in their data streams, which make it extremely challenging to design effective learning frameworks for anomaly based attack detection. In this paper, we propose a noise resilient learning framework for anomaly based attack detection specifically for distribution layer PMU infrastructure, that show real time indicators of data falsifications attacks while offsetting the effect of false alarms caused by the noise. Specifically, we propose a feature extraction framework that uses some Pythagorean Means of the active power from a cluster of PMUs, reducing multi-dimensional nature of the PMU data streams via quick big data summarization. We also propose a robust and noise resilient methodology for learning thresholds based on generalized robust estimation theory of our invariant feature. We experimentally validate our approach and demonstrate improved reliability performance using two completely different datasets collected from real distribution level PMU infrastructures. 
    more » « less
  3. Log anomaly detection, critical in identifying system failures and preempting security breaches, finds irregular patterns within large volumes of log data. Modern log anomaly detectors rely on training deep learning models on clean anomaly-free log data. However, such clean log data requires expensive and tedious human labeling. In this paper, we thus propose a robust log anomaly detection framework, PlutoNOSPACE, that automatically selects a clean representative sample subset of the polluted log sequence data to train a Transformer-based anomaly detection model. Pluto features three innovations. First, due to localized concentrations of anomalies inherent in the embedding space of log data, Pluto partitions the sequence embedding space generated by the model into regions that then allow it to identify and discard regions that are highly polluted by our pollution level estimation scheme, based on our pollution quantification via Gaussian mixture modeling. Second, for the remaining more slightly polluted regions, we select samples that maximally purify the eigenvector spectrum, which can be transformed into the NP-hard facility location problem; allowing us to leverage its greedy solution with a (1-(1/e)) approximation guarantee in optimality. Third, by iteratively alternating between the above subset selection, a model re-training on the latest subset, and a subset filtering using dynamic training artifacts generated by the latest model, the data selected is progressively refined. The final sample set is used to retrain the final anomaly detection model. Our experiments on four real-world log benchmark datasets demonstrate that by retaining 77.7% (BGL) to 96.6% (ThunderBird) of the normal sequences while effectively removing 90.3% (BGL) to 100.0% (ThunderBird, HDFS) of the anomalies, Pluto provides a significant absolute F-1 improvement up to 68.86% (2.16% → 71.02%) compared to the state-of-the-art sample selection methods. The implementation of this work is available at https://github.com/LeiMa0324/Pluto-SIGMOD25. 
    more » « less
  4. Log anomaly detection, critical in identifying system failures and preempting security breaches, finds irregular patterns within large volumes of log data. Modern log anomaly detectors rely on training deep learning models on clean anomaly-free log data. However, such clean log data requires expensive and tedious human labeling. In this paper, we thus propose a robust log anomaly detection framework, PlutoNOSPACE, that automatically selects a clean representative sample subset of the polluted log sequence data to train a Transformer-based anomaly detection model. Pluto features three innovations. First, due to localized concentrations of anomalies inherent in the embedding space of log data, Pluto partitions the sequence embedding space generated by the model into regions that then allow it to identify and discard regions that are highly polluted by our pollution level estimation scheme, based on our pollution quantification via Gaussian mixture modeling. Second, for the remaining more slightly polluted regions, we select samples that maximally purify the eigenvector spectrum, which can be transformed into the NP-hard facility location problem; allowing us to leverage its greedy solution with a (1-(1/e)) approximation guarantee in optimality. Third, by iteratively alternating between the above subset selection, a model re-training on the latest subset, and a subset filtering using dynamic training artifacts generated by the latest model, the data selected is progressively refined. The final sample set is used to retrain the final anomaly detection model. Our experiments on four real-world log benchmark datasets demonstrate that by retaining 77.7\% (BGL) to 96.6\% (ThunderBird) of the normal sequences while effectively removing 90.3\% (BGL) to 100.0\% (ThunderBird, HDFS) of the anomalies, Pluto provides a significant absolute F-1 improvement up to 68.86\% (2.16\% → 71.02\%) compared to the state-of-the-art sample selection methods. The implementation of this work is available at https://github.com/LeiMa0324/Pluto-SIGMOD25. 
    more » « less
  5. Modern smart cities need smart transportation solutions to quickly detect various traffic emergencies and incidents in the city to avoid cascading traffic disruptions. To materialize this, roadside units and ambient transportation sensors are being deployed to collect speed data that enables the monitoring of traffic conditions on each road segment. In this paper, we first propose a scalable data-driven anomaly-based traffic incident detection framework for a city-scale smart transportation system. Specifically, we propose an incremental region growing approximation algorithm for optimal Spatio-temporal clustering of road segments and their data; such that road segments are strategically divided into highly correlated clusters. The highly correlated clusters enable identifying a Pythagorean Mean-based invariant as an anomaly detection metric that is highly stable under no incidents but shows a deviation in the presence of incidents. We learn the bounds of the invariants in a robust manner such that anomaly detection can generalize to unseen events, even when learning from real noisy data. Second, using cluster-level detection, we propose a folded Gaussian classifier to pinpoint the particular segment in a cluster where the incident happened in an automated manner. We perform extensive experimental validation using mobility data collected from four cities in Tennessee, compare with the state-of-the-art ML methods, to prove that our method can detect incidents within each cluster in real-time and outperforms known ML methods. 
    more » « less