Attributed networks are a type of graph structured data used in many real-world scenarios. Detecting anomalies on attributed networks has a wide spectrum of applications such as spammer detection and fraud detection. Although this research area draws increasing attention in the last few years, previous works are mostly unsupervised because of expensive costs of labeling ground truth anomalies. Many recent studies have shown different types of anomalies are often mixed together on attributed networks and such invaluable human knowledge could provide complementary insights in advancing anomaly detection on attributed networks. To this end, we study the novel problem of modeling and integrating human knowledge of different anomaly types for attributed network anomaly detection. Specifically, we first model prior human knowledge through a novel data augmentation strategy. We then integrate the modeled knowledge in a Siamese graph neural network encoder through a well-designed contrastive loss. In the end, we train a decoder to reconstruct the original networks from the node representations learned by the encoder, and rank nodes according to its reconstruction error as the anomaly metric. Experiments on five real-world datasets demonstrate that the proposed framework outperforms the state-of-the-art anomaly detection algorithms. 
                        more » 
                        « less   
                    This content will become publicly available on December 2, 2025
                            
                            KnowGraph: Knowledge-Enabled Anomaly Detection via Logical Reasoning on Graph Data
                        
                    
    
            Graph-based anomaly detection is pivotal in diverse security applications, such as fraud detection in transaction networks and intrusion detection for network traffic. Standard approaches, including Graph Neural Networks (GNNs), often struggle to generalize across shifting data distributions. For instance, we observe that a real-world eBay transaction dataset revealed an over 50% decline in fraud detection accuracy when adding data from only a single new day to the graph due to data distribution shifts. This highlights a critical vulnerability in purely data-driven approaches. Meanwhile, real-world domain knowledge, such as "simultaneous transactions in two locations are suspicious," is more stable and a common existing component of real-world detection strategies. To explicitly integrate such knowledge into data-driven models such as GCNs, we propose KnowGraph, which integrates domain knowledge with data-driven learning for enhanced graph-based anomaly detection. KnowGraph comprises two principal components: (1) a statistical learning component that utilizes a main model for the overarching detection task, augmented by multiple specialized knowledge models that predict domain-specific semantic entities; (2) a reasoning component that employs probabilistic graphical models to execute logical inferences based on model outputs, encoding domain knowledge through weighted first-order logic formulas. In addition, KnowGraph has leveraged the Predictability-Computability-Stability (PCS) framework for veridical data science to estimate and mitigate prediction uncertainties. Empirically, KnowGraph has been rigorously evaluated on two significant real-world scenarios: collusion detection in the online marketplace eBay and intrusion detection within enterprise networks. Extensive experiments on these large-scale real-world datasets show that KnowGraph consistently outperforms state-of-the-art baselines in both transductive and inductive settings, achieving substantial gains in average precision when generalizing to completely unseen test graphs. Further ablation studies demonstrate the effectiveness of the proposed reasoning component in improving detection performance, especially under extreme class imbalance. These results highlight the potential of integrating domain knowledge into data-driven models for high-stakes, graph-based security applications. 
        more » 
        « less   
        
    
                            - Award ID(s):
- 2229876
- PAR ID:
- 10575576
- Publisher / Repository:
- ACM
- Date Published:
- ISBN:
- 9798400706363
- Page Range / eLocation ID:
- 168 to 182
- Format(s):
- Medium: X
- Location:
- Salt Lake City UT USA
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            DOS-GNN: Dual-Feature Aggregations with Over-Sampling for Class-Imbalanced Fraud Detection On GraphsAs fraudulent activities have shot up manifolds, fraud detection has emerged as a pivotal process in different fields (e.g., e-commerce, online reviews, and social networks). Since interactions among entities provide valuable insights into fraudulent activities, such behaviors can be naturally represented as graph structures, where graph neural networks (GNNs) have been developed as prominent models to boost the efficacy of fraud detection. In graph-based fraud detection, handling imbalanced datasets poses a significant challenge, as the minority class often gets overshadowed, diminishing the performance of conventional GNNs. While oversampling has recently been adapted for imbalanced graphs, it contends with issues such as graph heterophily and noisy edge synthesis. To address these limitations, this paper introduces DOS-GNN, incorporating Dual-feature aggregation with Over-Sampling to advance GNNs for class-imbalanced fraud detection on graphs. This model exploits feature separation and dual-feature aggregation to mitigate the impact of heterophily and acquire refined node embeddings that facilitate fraud oversampling to balance class distribution without the need for edge synthesis. Extensive experiments on four large and real-world fraud datasets demonstrate that DOS-GNN can significantly improve fraud detection performance on graphs with different imbalance ratios and homophily ratios, outperforming state-of-the-art GNN models.more » « less
- 
            Reliable methods for host-layer intrusion detection remained an open problem within computer security. Recent research has recast intrusion detection as a provenance graph anomaly detection problem thanks to concurrent advancements in machine learning and causal graph auditing. While these approaches show promise, their robustness against an adaptive adversary has yet to be proven. In particular, it is unclear if mimicry attacks, which plagued past approaches to host intrusion detection, have a similar effect on modern graph-based methods. In this work, we reveal that systematic design choices have allowed mimicry attacks to continue to abound in provenance graph host intrusion detection systems (Prov-HIDS). Against a corpus of exemplar Prov-HIDS, we develop evasion tactics that allow attackers to hide within benign process behaviors. Evaluating against public datasets, we demonstrate that an attacker can consistently evade detection (100% success rate) without modifying the underlying attack behaviors. We go on to show that our approach is feasible in live attack scenarios and outperforms domain-general adversarial sample techniques. Through open sourcing our code and datasets, this work will serve as a benchmark for the evaluation of future Prov-HIDS.more » « less
- 
            Network Intrusion Detection in Smart Grids for Imbalanced Attack Types Using Machine Learning ModelsSmart grid has evolved as the next generation power grid paradigm which enables the transfer of real time information between the utility company and the consumer via smart meter and advanced metering infrastructure (AMI). These information facilitate many services for both, such as automatic meter reading, demand side management, and time-of-use (TOU) pricing. However, there have been growing security and privacy concerns over smart grid systems, which are built with both smart and legacy information and operational technologies. Intrusion detection is a critical security service for smart grid systems, alerting the system operator for the presence of ongoing attacks. Hence, there has been lots of research conducted on intrusion detection in the past, especially anomaly-based intrusion detection. Problems emerge when common approaches of pattern recognition are used for imbalanced data which represent much more data instances belonging to normal behaviors than to attack ones, and these approaches cause low detection rates for minority classes. In this paper, we study various machine learning models to overcome this drawback by using CIC-IDS2018 dataset [1].more » « less
- 
            null (Ed.)Network anomaly detection aims to find network elements (e.g., nodes, edges, subgraphs) with significantly different behaviors from the vast majority. It has a profound impact in a variety of applications ranging from finance, healthcare to social network analysis. Due to the unbearable labeling cost, existing methods are predominately developed in an unsupervised manner. Nonetheless, the anomalies they identify may turn out to be data noises or uninteresting data instances due to the lack of prior knowledge on the anomalies of interest. Hence, it is critical to investigate and develop few-shot learning for network anomaly detection. In real-world scenarios, few labeled anomalies are also easy to be accessed on similar networks from the same domain as the target network, while most of the existing works omit to leverage them and merely focus on a single network. Taking advantage of this potential, in this work, we tackle the problem of few-shot network anomaly detection by (1) proposing a new family of graph neural networks -- Graph Deviation Networks (GDN) that can leverage a small number of labeled anomalies for enforcing statistically significant deviations between abnormal and normal nodes on a network; (2) equipping the proposed GDN with a new cross- network meta-learning algorithm to realize few-shot network anomaly detection by transferring meta-knowledge from multiple auxiliary networks. Extensive experimental evaluations demonstrate the efficacy of the proposed approach on few-shot or even one-shot network anomaly detection.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
