NSF PAR Search | NSF Public Access Repository


FlatD: Protecting Deep Neural Network Program from Reversing Attacks

Zhang, Jinquan; Wang, Zihao; Wang, Pei; Zhong, Rui; Wu, Dinghao (April 2025, Proceedings of the 47th International Conference on Software Engineering (ICSE 2025), Software Engineering in Practice (SEIP) track)

Free, publicly-accessible full text available April 27, 2026
Clustering-Augmented Fraud Detection on Graphs Using Label-Aware Feature Aggregation

Jing, Shixiong; Chen, Lingwei; Wu, Dinghao (December 2024, The 16th Asian Conference on Machine Learning (Conference Track))

Fraud detection has emerged as a pivotal process in different fields (e.g., e-commerce, social networks). Since interactions among entities provide valuable insights into fraudulent activities, such behaviors can be naturally represented as graphs, where graph neural networks (GNNs) have been developed as prominent models to boost the efficacy of fraud detection. However, the application of GNNs in this domain encounters significant challenges, primarily due to class imbalance and a mixture of homophily and heterophily of fraud graphs. To address these challenges, in this paper, we propose LACA, which implements fraud detection on graphs using Label-Aware feature aggregation to advance GNN training, which is regularized by Clustering Augmented optimization. Specifically, label-aware feature aggregation simplifies adaptive aggregation in homophily-heterophily mixed neighborhoods, preventing gradient domination by legitimate nodes and mitigating class imbalance in message passing. Clustering-augmented optimization provides fine-grained subclass semantics to improve detection performance, and yields additional benefit in addressing class imbalance. Extensive experiments on four fraud datasets demonstrate that LACA can significantly improve fraud detection performance on graphs with different imbalance ratios and homophily ratios, outperforming state-of-the-art GNN models.
Free, publicly-accessible full text available December 5, 2025
DeepType: Refining Indirect Call Targets with Strong Multi-layer Type Analysis

Xia, Tianrou; Hu, Hong; Wu, Dinghao (August 2024, Proceedings of the 33rd USENIX Security Symposium)

Full Text Available
DEEPTYPE: Refining Indirect Call Targets with Strong Multi-layer Type Analysis

Xia, Tianrou; Hu, Hong; Wu, Dinghao (August 2024, The USENIX Association)

Indirect calls, while facilitating dynamic execution characteristics in C and C++ programs, impose challenges on precise construction of the control-flow graph (CFG). This hinders effective program analyses for bug detection (e.g., fuzzing) and program protection (e.g., control-flow integrity). Solutions using data-tracking and type-based analysis have been proposed for identifying indirect call targets, but they are either time-consuming or imprecise. Multi-layer type analysis (MLTA), the state-of-the-art approach, upgrades type-based analysis by leveraging the multi-layer type hierarchy, but its handling of the information flow between multi-layer types introduces false positives. In this paper, we propose strong multi-layer type analysis (SMLTA) and implement a prototype, DEEPTYPE, to further refine indirect call targets. It adopts a robust solution to record and retrieve type information, avoiding information loss and enhancing accuracy. We evaluate DEEPTYPE on the Linux kernel, 5 web servers, and 14 user applications. Compared to TypeDive, the prototype of MLTA, DEEPTYPE is able to narrow down the scope of indirect call targets by 43.11% on average across most benchmarks and reduce runtime overhead by 5.45% to 72.95%, which demonstrates the effectiveness, efficiency and applicability of SMLTA.
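
As a rough illustration of the type-based target resolution this abstract describes, the sketch below contrasts matching on function signature alone with additionally matching on the struct field that holds the function pointer. Every function name, struct name, and the fallback rule are hypothetical; this is a toy model of the general idea, not the SMLTA algorithm or the DEEPTYPE implementation.

```python
# Toy facts about a hypothetical program (all names are made up).
# Address-taken functions mapped to their signatures.
SIGNATURES = {
    "disk_read": "int(char*, int)",
    "net_read": "int(char*, int)",
    "log_write": "int(char*, int)",
    "on_exit": "void()",
}

# Where function addresses are stored: (struct type, field) -> functions.
FIELD_STORES = {
    ("file_ops", "read"): {"disk_read", "net_read"},
    ("file_ops", "write"): {"log_write"},
}


def signature_targets(sig):
    """Single-layer matching: every address-taken function with this signature."""
    return sorted(f for f, s in SIGNATURES.items() if s == sig)


def multilayer_targets(sig, struct, field):
    """Also require that the function was stored into the given struct field.

    If nothing is recorded for the field, fall back to signature matching
    so that no possible target is dropped (an assumption of this sketch).
    """
    stored = FIELD_STORES.get((struct, field))
    if stored is None:
        return signature_targets(sig)
    return sorted(f for f in stored if SIGNATURES[f] == sig)


# An indirect call through file_ops.read with signature int(char*, int).
by_signature = signature_targets("int(char*, int)")
by_field = multilayer_targets("int(char*, int)", "file_ops", "read")
```

In this toy setup the field-aware lookup returns a subset of the signature-only candidates, which is the kind of narrowing the abstract reports at much larger scale.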
Full Text Available
H^2GNN: Graph Neural Networks with Homophilic and Heterophilic Feature Aggregations

Jing, Shixiong; Chen, Lingwei; Li, Quan; Wu, Dinghao (July 2024, International Conference on Database Systems for Advanced Applications, Springer Nature Singapore)

Graph neural networks (GNNs) rely on the assumption of graph homophily, which, however, does not hold in some real-world scenarios. Graph heterophily compromises them by smoothing node representations and degrading their discrimination capabilities. To address this limitation, we propose H^2GNN, which implements Homophilic and Heterophilic feature aggregations to advance GNNs in graphs with homophily or heterophily. H^2GNN proceeds by combining local feature separation and adaptive message aggregation, where each node separates local features into similar and dissimilar feature vectors, and aggregates similarities and dissimilarities from neighbors based on connection property. This allows both similar and dissimilar features for each node to be effectively preserved and propagated, and thus mitigates the impact of heterophily on the graph learning process. As dual feature aggregations introduce extra model complexity, we also offer a simplified implementation of H^2GNN to reduce training time. Extensive experiments on seven benchmark datasets demonstrate that H^2GNN can significantly improve node classification performance in graphs with different homophily ratios, outperforming state-of-the-art GNN models.
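
To make the smoothing problem in this abstract concrete, the sketch below computes the edge homophily ratio (the fraction of edges joining same-label nodes) and applies one round of plain mean aggregation on a small, fully heterophilous path graph. The graph, labels, features, and function names are illustrative assumptions; this shows the failure mode that motivates the paper, not H^2GNN itself.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints share a label."""
    return sum(labels[u] == labels[v] for u, v in edges) / len(edges)


def mean_aggregate(features, edges):
    """One round of mean aggregation over each node and its neighbors."""
    neighborhoods = [[i] for i in range(len(features))]  # start with self
    for u, v in edges:
        neighborhoods[u].append(v)
        neighborhoods[v].append(u)
    return [sum(features[j] for j in ns) / len(ns) for ns in neighborhoods]


def class_gap(features, labels):
    """Absolute difference between the mean feature of class 0 and class 1."""
    zeros = [f for f, y in zip(features, labels) if y == 0]
    ones = [f for f, y in zip(features, labels) if y == 1]
    return abs(sum(zeros) / len(zeros) - sum(ones) / len(ones))


# Path graph 0-1-2-3 whose neighbors always have the opposite label.
edges = [(0, 1), (1, 2), (2, 3)]
labels = [0, 1, 0, 1]
features = [0.0, 1.0, 0.0, 1.0]  # perfectly separable before aggregation

ratio = edge_homophily(edges, labels)
gap_before = class_gap(features, labels)
gap_after = class_gap(mean_aggregate(features, edges), labels)
```

On this graph the homophily ratio is zero, and a single aggregation step pulls the two class means much closer together, which is the loss of discrimination the abstract attributes to heterophily.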
Full Text Available
DOS-GNN: Dual-Feature Aggregations with Over-Sampling for Class-Imbalanced Fraud Detection On Graphs

https://doi.org/10.1109/IJCNN60899.2024.10650494

Jing, Shixiong; Chen, Lingwei; Li, Quan; Wu, Dinghao (June 2024, International Joint Conference on Neural Networks)

As fraudulent activities have grown manifold, fraud detection has emerged as a pivotal process in different fields (e.g., e-commerce, online reviews, and social networks). Since interactions among entities provide valuable insights into fraudulent activities, such behaviors can be naturally represented as graph structures, where graph neural networks (GNNs) have been developed as prominent models to boost the efficacy of fraud detection. In graph-based fraud detection, handling imbalanced datasets poses a significant challenge, as the minority class often gets overshadowed, diminishing the performance of conventional GNNs. While oversampling has recently been adapted for imbalanced graphs, it contends with issues such as graph heterophily and noisy edge synthesis. To address these limitations, this paper introduces DOS-GNN, incorporating Dual-feature aggregation with Over-Sampling to advance GNNs for class-imbalanced fraud detection on graphs. This model exploits feature separation and dual-feature aggregation to mitigate the impact of heterophily and acquire refined node embeddings that facilitate fraud oversampling to balance class distribution without the need for edge synthesis. Extensive experiments on four large and real-world fraud datasets demonstrate that DOS-GNN can significantly improve fraud detection performance on graphs with different imbalance ratios and homophily ratios, outperforming state-of-the-art GNN models.
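
One way to picture oversampling in embedding space without edge synthesis is the SMOTE-style interpolation sketched below: new minority points are drawn on the segment between two existing minority embeddings until the classes are balanced. The interpolation rule, the binary-label assumption, and all names are assumptions made for illustration; the abstract does not specify how DOS-GNN generates its samples.

```python
import random


def oversample_minority(embeddings, labels, minority=1, seed=0):
    """Balance a binary dataset by interpolating between minority embeddings.

    Assumes at least two minority points exist; no graph edges are created.
    """
    rng = random.Random(seed)
    minority_vecs = [e for e, y in zip(embeddings, labels) if y == minority]
    n_majority = sum(1 for y in labels if y != minority)
    synthetic = []
    while len(minority_vecs) + len(synthetic) < n_majority:
        a, b = rng.sample(minority_vecs, 2)
        t = rng.random()  # position along the segment from a to b
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return embeddings + synthetic, labels + [minority] * len(synthetic)


# Six legitimate nodes (label 0) and two fraud nodes (label 1), as 2-D embeddings.
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2], [0.0, 0.0], [0.2, 0.2],
              [1.0, 1.0], [2.0, 1.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_x, balanced_y = oversample_minority(embeddings, labels)
```

Because the synthetic points exist only as embeddings, a classifier can be trained on the balanced set directly, with no need to decide which edges a synthetic node should have.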
Full Text Available
Graph Adversarial Diffusion Convolution

Liu, Songtao; Chen, Jinghui; Fu, Tianfan; Lin, Lu; Zitnik, Marinka; Wu, Dinghao (July 2024, Proceedings of the 41st International Conference on Machine Learning (ICML))

This paper introduces a min-max optimization formulation for the Graph Signal Denoising (GSD) problem. In this formulation, we first maximize the second term of GSD by introducing perturbations to the graph structure based on Laplacian distance and then minimize the overall loss of the GSD. By solving the min-max optimization problem, we derive a new variant of the Graph Diffusion Convolution (GDC) architecture, called Graph Adversarial Diffusion Convolution (GADC). GADC differs from GDC by incorporating an additional term that enhances robustness against adversarial attacks on the graph structure and noise in node features. Moreover, GADC improves the performance of GDC on heterophilic graphs. Extensive experiments demonstrate the effectiveness of GADC across various datasets. Code is available at https://github.com/SongtaoLiu0823/GADC.
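
For context, the sketch below minimizes the standard graph signal denoising objective, a fidelity term plus a weighted Laplacian smoothness term, using Jacobi iterations on a toy graph. It assumes scalar node signals, the unnormalized Laplacian, and that the smoothness term is the "second term" the abstract refers to. It covers only the plain minimization; the adversarial maximization over graph perturbations and GADC itself are not implemented, and the function names are illustrative.

```python
def gsd_loss(f, x, edges, lam):
    """Fidelity sum_i (f_i - x_i)^2 plus lam * sum over edges of (f_u - f_v)^2."""
    fidelity = sum((fi - xi) ** 2 for fi, xi in zip(f, x))
    smoothness = sum((f[u] - f[v]) ** 2 for u, v in edges)
    return fidelity + lam * smoothness


def denoise(x, edges, lam=1.0, iters=200):
    """Jacobi iterations for the minimizer of gsd_loss.

    Setting the gradient to zero gives, for every node i with degree d_i,
    f_i = (x_i + lam * sum of neighbor values) / (1 + lam * d_i).
    Assumes a simple graph (no self-loops or repeated edges).
    """
    nbrs = [[] for _ in x]
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    f = list(x)
    for _ in range(iters):
        f = [(x[i] + lam * sum(f[j] for j in nbrs[i])) / (1 + lam * len(nbrs[i]))
             for i in range(len(x))]
    return f


# Noisy signal on the path graph 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
noisy = [1.0, 0.0, 1.2, 0.9]
smooth = denoise(noisy, edges, lam=1.0)
```

Larger values of `lam` weight the smoothness term more heavily and pull neighboring values closer together, at the cost of moving further from the observed signal.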
Full Text Available
Pseudo-Labeling with Graph Active Learning for Few-shot Node Classification

https://doi.org/10.1109/ICDM58522.2023.00133

Li, Quan; Chen, Lingwei; Jing, Shixiong; Wu, Dinghao (December 2023, IEEE International Conference on Data Mining)

Graphs have emerged as one of the most important and powerful data structures to perform content analysis in many fields. In this line of work, node classification is a classic task, which is generally performed using graph neural networks (GNNs). Unfortunately, regular GNNs do not generalize well to real-world application scenarios where labeled nodes are few. To address this challenge, we propose a novel few-shot node classification model that leverages pseudo-labeling with graph active learning. We first provide a theoretical analysis to argue that extra unlabeled data benefit few-shot classification. Inspired by this, our model proceeds by performing multi-level data augmentation with consistency and contrastive regularizations for better semi-supervised pseudo-labeling, and further devising graph active learning to facilitate pseudo-label selection and improve model effectiveness. Extensive experiments on four public citation networks demonstrate that our model can effectively improve node classification accuracy with very few labeled data, significantly outperforming all state-of-the-art baselines by large margins.
Full Text Available
LibSteal: Model Extraction Attack towards Deep Learning Compilers by Reversing DNN Binary Library

https://doi.org/10.5220/0011754900003464

Zhang, Jinquan; Wang, Pei; Wu, Dinghao (January 2023, Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE))

Full Text Available
