NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

DESCG: data encoding scheme classification with GNN in binary analysis

https://doi.org/10.1007/s10515-025-00538-0

Dai, Xushu; Luo, Nanqing; Wang, Haizhou; Wang, Zhilong; Cao, Chen; Liu, Peng (July 2025, Automated Software Engineering)

Binary analysis, the process of examining software without its source code, plays a crucial role in understanding program behavior, e.g., evaluating the security properties of commercial software and analyzing malware. One challenging aspect of this process is classifying data encoding schemes, such as encryption and compression, because high-level semantic information is absent. Existing approaches rely either on code similarity, which only works for known schemes, or on heuristic rules, which lack scalability. In this paper, we propose DESCG, a novel deep learning-based method for automatically classifying four widely employed kinds of data encoding schemes in binary programs: encryption, compression, decompression, and hashing. Our approach leverages dynamic analysis to extract execution traces from binary programs, builds data dependency graphs from these traces, and incorporates critical feature engineering. By combining this specialized graph representation with a Graph Neural Network (GNN), our approach enables accurate classification without requiring prior knowledge of specific encoding schemes. The evaluation results show that DESCG achieves 97.7% accuracy and an F1 score of 97.67%, outperforming baseline models. We also conducted an extensive evaluation of DESCG to determine which features matter most to it and to examine its performance and overhead.
Free, publicly-accessible full text available July 18, 2026
Tackling imbalanced data in cybersecurity with transfer learning: a case with ROP payload detection

https://doi.org/10.1186/s42400-022-00135-8

Wang, Haizhou; Singhal, Anoop; Liu, Peng (January 2023, Cybersecurity)

In recent years, deep learning has gained growing popularity in the cybersecurity application domain because, compared to traditional machine learning methods, it usually involves less human effort, produces better results, and provides better generalizability. However, the imbalanced data issue is very common in cybersecurity and can substantially degrade the performance of deep learning models. This paper introduces a transfer learning-based method to tackle the imbalanced data issue in cybersecurity, using return-oriented programming (ROP) payload detection as a case study. We achieved an average false positive rate of 0.0290, an average F1 score of 0.9705, and an average detection rate of 0.9521 on 3 different target domain programs using 2 different source domain programs, with zero benign training data samples in the target domain. The performance improvement compared to the baseline is a trade-off between false positive rate and detection rate. Using our approach, the total number of false positives is reduced by 23.16%, and as a trade-off, the number of detected malicious samples decreases by 0.68%.
To Protect the LLM Agent Against the Prompt Injection Attack with Polymorphic Prompt

https://doi.org/10.1109/DSN-S65789.2025.00037

Wang, Zhilong; Nagaraja, Neha; Zhang, Lan; Bahsi, Hayretdin; Patil, Pawan; Liu, Peng (June 2025, IEEE)

Free, publicly-accessible full text available June 23, 2026
IoT Firmware Emulation and Its Security Application in Fuzzing: A Critical Revisit

https://doi.org/10.3390/fi17010019

Zhou, Wei; Shen, Shandian; Liu, Peng (January 2025, Future Internet)

As IoT devices with microcontroller (MCU)-based firmware become more common in our lives, memory corruption vulnerabilities in their firmware are increasingly targeted by adversaries. Fuzzing is a powerful method for detecting these vulnerabilities, but it poses unique challenges when applied to IoT devices. Direct fuzzing on these devices is inefficient, and recent efforts have shifted towards creating emulation environments for dynamic firmware testing. However, unlike traditional software, firmware interacts with peripherals that are significantly more diverse, which presents new challenges for achieving scalable full-system emulation and effective fuzzing. This paper reviews 27 state-of-the-art works in MCU-based firmware emulation and its applications in fuzzing. Instead of classifying existing techniques based on their capabilities and features, we first identify the fundamental challenges faced by firmware emulation and fuzzing. We then revisit recent studies, organizing them according to the specific challenges they address and discussing how each challenge is addressed. We compare the emulation fidelity and bug detection capabilities of various techniques to clearly demonstrate their strengths and weaknesses, aiding users in selecting or combining tools to meet their needs. Finally, we highlight the remaining technical gaps and point out important future research directions in firmware emulation and fuzzing.
Free, publicly-accessible full text available January 1, 2026
Meta-reinforcement learning with universal policy adaptation: Provable near-optimality under all-task optimum comparator

Xu, Siyuan; Zhu, Minghui (December 2024, Conference on Neural Information Processing Systems)

Free, publicly-accessible full text available December 16, 2025
Evasive attacks against autoencoder-based cyberattack detection systems in power systems

https://doi.org/10.1016/j.egyai.2024.100381

Khaw, Yew Meng; Jahromi, Amir Abiri; Arani, Mohammadreza FM; Kundur, Deepa (September 2024, Energy and AI)

Full Text Available
Evaluating Large Language Models for Real-World Vulnerability Repair in C/C++ Code

https://doi.org/10.1145/3643651.3659892

Zhang, Lan; Zou, Qingtian; Singhal, Anoop; Sun, Xiaoyan; Liu, Peng (June 2024, ACM)

Full Text Available
Analysis of neural network detectors for network attacks

https://doi.org/10.3233/JCS-230031

Zou, Qingtian; Zhang, Lan; Singhal, Anoop; Sun, Xiaoyan; Liu, Peng (June 2024, Journal of Computer Security)

While network attacks play a critical role in many advanced persistent threat (APT) campaigns, an arms race exists between network defenders and the adversary: to make APT campaigns stealthy, the adversary is strongly motivated to evade the detection system. However, new studies have shown that neural networks are likely a game-changer in the arms race: neural networks could be applied to achieve accurate, signature-free, and low-false-alarm-rate detection. In this work, we investigate whether the adversary could fight back during the next phase of the arms race. In particular, noticing that none of the existing adversarial example generation methods can generate malicious packets (and sessions) that simultaneously compromise the target machine and evade the neural network detection model, we propose a novel attack method to achieve this goal. We have designed and implemented the new attack. We have also used Address Resolution Protocol (ARP) Poisoning and Domain Name System (DNS) Cache Poisoning as case studies to demonstrate the effectiveness of the proposed attack.
Full Text Available
Using Explainable AI for Neural Network-Based Network Attack Detection

https://doi.org/10.1109/MC.2023.3342602

Zou, Qingtian; Zhang, Lan; Sun, Xiaoyan; Singhal, Anoop; Liu, Peng (May 2024, Computer)

Full Text Available
Online constrained meta-learning: Provable guarantees for generalization

Xu, Siyuan; Zhu, Minghui (December 2023, Conference on Neural Information Processing Systems)

Full Text Available
