This content will become publicly available on January 7, 2026

Title: DMDb: Uncovering Criminal Hacking on the Dark Web to Enhance Cyber Threat Intelligence Research
The emergence of the dark web has enabled hackers to anonymously exchange information and trade malware worldwide, exposing organizations to an unprecedented number of threats. Without visibility into this offensive base, defenders are often left to mitigate damage. While prior cyber-threat intelligence research has been valuable, it has been constrained by incomplete, outdated, and noisy datasets. In this paper, we detail our efforts to build a comprehensive repository that illuminates the current plans of cyber-attackers. We achieve this by designing and deploying DarkMiner, a system that regularly scrapes the Tor network to populate the DarkMiner Database (DMDb). DMDb offers researchers a structured criminal hacking data collection enhanced with non-textual fields and object change tracking capabilities. To show its potential, we present three case studies analyzing: 1) cyber threat market fluctuations, 2) image-based vendor attribution, and 3) software vulnerability targeting.
Award ID(s):
2246220
PAR ID:
10632599
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
Association for Information Systems (AIS)
Date Published:
Edition / Version:
58
ISBN:
978-0-9981331-8-8
Subject(s) / Keyword(s):
Dark web, hacking, threats, database.
Format(s):
Medium: X Other: pdf
Location:
Waikoloa, HI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Large enterprises increasingly rely on threat detection software (e.g., Intrusion Detection Systems) to spot suspicious activities. These tools generate alerts that cyber analysts must investigate to determine whether they are true attacks. Unfortunately, in practice, there are more alerts than cyber analysts can properly investigate. This leads to a “threat alert fatigue” or information overload problem where cyber analysts miss true attack alerts in the noise of false alarms. In this paper, we present NoDoze to combat this challenge using contextual and historical information of generated threat alerts in an enterprise. NoDoze first generates a causal dependency graph of an alert event. Then, it assigns an anomaly score to each event in the dependency graph based on the frequency with which related events have happened before in the enterprise. NoDoze then propagates those scores along the edges of the graph using a novel network diffusion algorithm and generates a subgraph with an aggregate anomaly score which is used to triage alerts. Evaluation on our dataset of 364 threat alerts shows that NoDoze decreases the volume of false alarms by 86%, saving more than 90 hours of analysts’ time, which was required to investigate those false alarms. Furthermore, NoDoze-generated dependency graphs of true alerts are 2 orders of magnitude smaller than those generated by traditional tools without sacrificing the vital information needed for the investigation. Our system has a low average runtime overhead and can be deployed with any threat detection software.
  2. The frequency and costs of cyber-attacks are increasing each year. By the end of 2019, the total cost of data breaches is expected to reach $2.1 trillion through the ever-growing online presence of enterprises and their consumers. The tools to perform these attacks and the breached data can often be purchased within the Dark-net. Many of the threat actors within this realm use its various platforms to broker, discuss, and strategize these cyber-threat assets. To combat these attacks, researchers are developing Cyber-Threat Intelligence (CTI) tools to proactively monitor the ever-growing online hacker community. This paper will detail the creation and use of a CTI tool that leverages a social network to identify cyber-threats across major Dark-net data sources. Through this network, emerging threats can be quickly identified so proactive or reactive security measures can be implemented.
  3. Cyber-defenders must account for users’ perceptions of attack consequence severity. However, research has yet to investigate such perceptions of a wide range of cyber-attack consequences. Thus, we had users rate the severity of 50 cyber-attack consequences. We then analyzed those ratings to a) understand perceived severity for each consequence, and b) compare perceived severity across select consequences. Further, we grouped ratings into the STRIDE threat model categories and c) analyzed whether perceived severity varied across those categories. The current study’s results suggest not all consequences are perceived to be equally severe; likewise, not all STRIDE threat model categories are perceived to be equally severe. Implications for designing warning messages and modeling threats are discussed. 
  4. With the rapid adoption of web services, the need to protect against various threats has become imperative for organizations operating in cyberspace. Organizations are increasingly opting to get financial cover in the event of losses due to a security incident. This helps them safeguard against the threat posed to third-party services that the organization uses. It is in the organization’s interest to understand the insurance requirements and procure all necessary direct and liability coverages. This helps transfer some risks to the insurance providers. However, cyber insurance policies often list details about coverages and exclusions using legalese that can be difficult to comprehend. Currently, it takes a significant manual effort to parse and extract knowledgeable rules from these lengthy and complicated policy documents. We have developed a semantically rich, machine-processable framework to automatically analyze cyber insurance policies and populate a knowledge graph that efficiently captures various inclusion and exclusion terms and rules embedded in the policy. In this paper, we describe this framework that has been built using technologies from AI, including Semantic Web, Modal/Deontic Logic, and Natural Language Processing. We have validated our approach using industry standards proposed by the United States Federal Trade Commission (FTC) and applying it against publicly available policies of 7 cyber insurance vendors. Our system will enable cyber insurance seekers to automatically analyze various policy documents and make a well-informed decision by identifying their inclusions and exclusions.
  5. The accelerated growth of urban areas in the last decades has led to an unprecedented increase in the construction of wind-sensitive structures, e.g., long-span bridges, tall buildings, wind turbines, and solar trackers. To effectively control undesired wind- and earthquake-induced responses, a plethora of operational technology and cyber-physical systems have been introduced, including supervisory control and data acquisition systems, programmable logic controllers, and remote terminal units. All these systems are potential targets for cyberattacks and have already been attacked in other sectors, including energy, industry, education, and health. This study analyzes this threat to critical infrastructure, quantifies its potential damage, and develops possible countermeasures and cyber-defenses so the structural engineering community can effectively address this emerging challenge. 