skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: LINKING EXPLOITS FROM THE DARK WEB TO KNOWN VULNERABILITIES FOR PROACTIVE CYBER THREAT INTELLIGENCE: AN ATTENTION-BASED DEEP STRUCTURED SEMANTIC MODEL1
Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision at 1 scores 20% - 41% higher than non-DL approaches and 4% - 10% higher than DL-based approaches. We demonstrated the EVA-DSSM’s and DVSM’s practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches. Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors.  more » « less
Award ID(s):
1917117 1921485 1303362
PAR ID:
10336823
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
MIS quarterly
Volume:
46
Issue:
2
ISSN:
0276-7783
Page Range / eLocation ID:
911-946
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision at 1 scores 20%-41% higher than non-DL approaches and 4%-10% higher than DL-based approaches. We demonstrated the EVA-DSSM's and DVSM's practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches. Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors. 
    more » « less
  2. Cyber threat intelligence (CTI) is an actionable information or insight an organization uses to understand potential vulnerabilities it does have and threats it is facing. One important CTI for proactive cyber defense is exploit type with possible values system, web, network, website or Mobile. This study compares the performance of machine learning algorithms in predicating exploit types using form posts in the dark web, which is a semi- structured dataset collected from dark web. The study uses the CRISP data science approach. The results of the study show that machine learning algorithms which are function-based including support vector machine and deep-learning using artificial neural network are more accurate than those algorithms which are based on tree including Random Forest and Decision-Tree for CTI discovery from semi-structured dataset. Future research will include the use of high-performance computing and advanced deep-learning algorithms. 
    more » « less
  3. null (Ed.)
    The implementation of Internet of Things (IoT) devices in medical environments, has introduced a growing list of security vulnerabilities and threats. The lack of an extensible big data resource that captures medical device vulnerabilities limits the use of Artificial Intelligence (AI) based cyber defense systems in capturing, detecting, and preventing known and future attacks. We describe a system that generates a repository of Cyber Threat Intelligence (CTI) about various medical devices and their known vulnerabilities from sources such as manufacturer and ICS-CERT vulnerability alerts. We augment the intelligence repository with data sources such as Wikidata and public medical databases. The combined resources are integrated with threat intelligence in our Cybersecurity Knowledge Graph (CKG) from previous research. The augmented graph embeddings are useful in querying relevant information and can help in various AI assisted cybersecurity tasks. Given the integration of multiple resources, we found the augmented CKG produced higher quality graph representations. The augmented CKG produced a 31% increase in the Mean Average Precision (MAP) value, computed over an information retrieval task. 
    more » « less
  4. Operating systems play a crucial role in computer systems, serving as the fundamental infrastructure that supports a wide range of applications and services. However, they are also prime targets for malicious actors seeking to exploit vulnerabilities and compromise system security. This is a crucial area that requires active research; however, OS vulnerabilities have not been actively studied in recent years. Therefore, we conduct a comprehensive analysis of OS vulnerabilities, aiming to enhance the understanding of their trends, severity, and common weaknesses. Our research methodology encompasses data preparation, sampling of vulnerable OS categories and versions, and an in-depth analysis of trends, severity levels, and types of OS vulnerabilities. We scrape the high-level data from reliable and recognized sources to generate two refined OS vulnerability datasets: one for OS categories and another for OS versions. Our study reveals the susceptibility of popular operating systems such as Windows, Windows Server, Debian Linux, and Mac OS. Specifically, Windows 10, Windows 11, Android (v11.0, v12.0, v13.0), Windows Server 2012, Debian Linux (v10.0, v11.0), Fedora 37, and HarmonyOS 2, are identified as the most vulnerable OS versions in recent years (2021–2022). Notably, these vulnerabilities exhibit a high severity, with maximum CVSS scores falling into the 7–8 and 9–10 range. Common vulnerability types, including CWE-119, CWE-20, CWE-200, and CWE-787, are prevalent in these OSs and require specific attention from OS vendors. The findings on trends, severity, and types of OS vulnerabilities from this research will serve as a valuable resource for vendors, security professionals, and end-users, empowering them to enhance OS security measures, prioritize vulnerability management efforts, and make informed decisions to mitigate risks associated with these vulnerabilities. 
    more » « less
  5. As the number and severity of security incidents continue to increase, remediating vulnerabilities and weaknesses has become a daunting task due to the sheer number of known vulnerabilities. Different scoring systems have been developed to provide qualitative and quantitative assessments of the severity of common vulnerabilities and weaknesses, and guide the prioritization of vulnerability remediation. However, these scoring systems provide only generic rankings of common weaknesses, which do not consider the specific vulnerabilities that exist in each system. To address this limitation, and building on recent principled approaches to vulnerability scoring, we propose new common weakness scoring metrics that consider the findings of vulnerability scanners, including the number of instances of each vulnerability across a system, and enable system-specific rankings that can provide actionable intelligence to security administrators. We built a small testbed to evaluate the proposed metrics against an existing metric, and show that the results are consistent with our intuition. 
    more » « less