Detecting vulnerable code blocks has become a highly popular topic in computer-aided design, especially with the advancement of natural language processing (NLP). Analyzing hardware description languages ( HDLs ), such as Verilog, involves dealing with lengthy code. This letter introduces an innovative identification of attack-vulnerable hardware by the use of opcode processing. Leveraging the advantage of architecturally defined opcodes and expressing all operations at the beginning of each code line, the word processing problem is efficiently transformed into opcode processing. This research converts a benchmark dataset into an intermediary code stack, subsequently classifying secure and fragile codes using NLP techniques. The results reveal a framework that achieves up to 94% accuracy when employing sophisticated convolutional neural networks (CNNs) architecture with extra embedding layers. Thus, it provides a means for users to quickly verify the vulnerability of their HDL code by inspecting a supervised learning model trained on the predefined vulnerabilities. It also supports the superior efficacy of opcode -based processing in Trojan detection by analyzing the outcomes derived from a model trained using the HDL dataset.
more »
« less
Security Vulnerability Detection Using Deep Learning Natural Language Processing
Detecting security vulnerabilities in software before they are exploited has been a challenging problem for decades. Traditional code analysis methods have been proposed, but are often ineffective and inefficient. In this work, we model software vulnerability detection as a natural language processing (NLP) problem with source code treated as texts, and address the auto-mated software venerability detection with recent advanced deep learning NLP models assisted by transfer learning on written English. For training and testing, we have preprocessed the NIST NVD/SARD databases and built a dataset of over 100,000 files in C programming language with 123 types of vulnerabilities. The extensive experiments generate the best performance of over 93% accuracy in detecting security vulnerabilities.
more »
« less
- Award ID(s):
- 2109971
- PAR ID:
- 10287750
- Date Published:
- Journal Name:
- IEEE INFOCOM WKSHPS: BigSecurity 2021: International Workshop on Security and Privacy in Big Data
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Recent years have witnessed the rise of Internet-of-Things (IoT) based cyber attacks. These attacks, as expected, are launched from compromised IoT devices by exploiting security flaws already known. Less clear, however, are the fundamental causes of the pervasiveness of IoT device vulnerabilities and their security implications, particularly in how they affect ongoing cybercrimes. To better understand the problems and seek effective means to suppress the wave of IoT-based attacks, we conduct a comprehensive study based on a large number of real-world attack traces collected from our honeypots, attack tools purchased from the underground, and information collected from high-profile IoT attacks. This study sheds new light on the device vulnerabilities of today's IoT systems and their security implications: ongoing cyber attacks heavily rely on these known vulnerabilities and the attack code released through their reports; on the other hand, such a reliance on known vulnerabilities can actually be used against adversaries. The same bug reports that enable the development of an attack at an exceedingly low cost can also be leveraged to extract vulnerability-specific features that help stop the attack. In particular, we leverage Natural Language Processing (NLP) to automatically collect and analyze more than 7,500 security reports (with 12,286 security critical IoT flaws in total) scattered across bug-reporting blogs, forums, and mailing lists on the Internet. We show that signatures can be automatically generated through an NLP-based report analysis, and be used by intrusion detection or firewall systems to effectively mitigate the threats from today's IoT-based attacks.more » « less
-
Recent years have witnessed the rise of Internet-of-Things (IoT) based cyber attacks. These attacks, as expected, are launched from compromised IoT devices by exploiting security flaws already known. Less clear, however, are the fundamental causes of the pervasiveness of IoT device vulnerabilities and their security implications, particularly in how they affect ongoing cybercrimes. To better understand the problems and seek effective means to suppress the wave of IoT-based attacks, we conduct a comprehensive study based on a large number of real-world attack traces collected from our honeypots, attack tools purchased from the underground, and information collected from high-profile IoT attacks. This study sheds new light on the device vulnerabilities of today’s IoT systems and their security implications: ongoing cyber attacks heavily rely on these known vulnerabilities and the attack code released through their reports; on the other hand, such a reliance on known vulnerabilities can actually be used against adversaries. The same bug reports that enable the development of an attack at an exceedingly low cost can also be leveraged to extract vulnerability-specific features that help stop the attack. In particular, we leverage Natural Language Processing (NLP) to automatically collect and analyze more than 7,500 security reports (with 12,286 security critical IoT flaws in total) scattered across bug-reporting blogs, forums, and mailing lists on the Internet. We show that signatures can be automatically generated through an NLP-based report analysis, and be used by intrusion detection or firewall systems to effectively mitigate the threats from today’s IoT-based attacks.more » « less
-
Detecting software vulnerabilities has been a challenge for decades. Many techniques have been developed to detect vulnerabilities by reporting whether a vulnerability exists in the code of software. But few of them have the capability to categorize the types of detected vulnerabilities, which is crucial for human developers or other tools to analyze and address vulnerabilities. In this paper, we present our work on identifying the types of vulnerabilities using deep learning. Our data consists of code slices parsed in a manner that captures the syntax and semantics of a vulnerability, sourced from prior work. We train deep neural networks on these features to perform multiclass classification of software vulnerabilities in the dataset. Our experiments show that our models can effectively identify the vulnerability classes of the vulnerable functions in our dataset.more » « less
-
This paper presents an innovative approach to DevOps security education, addressing the dynamic landscape of cybersecurity threats. We propose a student-centered learning methodology by developing comprehensive hands-on learning modules. Specifically, we introduce labware modules designed to automate static security analysis, empowering learners to identify known vulnerabilities efficiently. These modules offer a structured learning experience with pre-lab, hands-on, and post-lab sections, guiding students through DevOps concepts and security challenges. In this paper, we introduce hands-on learning modules that familiarize students with recognizing known security flaws through the application of Git Hooks. Through practical exercises with real-world code examples containing security flaws, students gain proficiency in detecting vulnerabilities using relevant tools. Initial evaluations conducted across educational institutions indicate that these hands-on modules foster student interest in software security and cybersecurity and equip them with practical skills to address DevOps security vulnerabilities.more » « less
An official website of the United States government

