skip to main content


Title: Reading the Tea Leaves: A Comparative Analysis of Threat Intelligence
The term "threat intelligence" has swiftly become a staple buzzword in the computer security industry. The entirely reasonable premise is that, by compiling up-to-date information about known threats (i.e., IP addresses, domain names, file hashes, etc.), recipients of such information may be able to better defend their systems from future attacks. Thus, today a wide array of public and commercial sources distribute threat intelligence data feeds to support this purpose. However, our understanding of this data, its characterization and the extent to which it can meaningfully support its intended uses, is still quite limited. In this paper, we address these gaps by formally defining a set of metrics for characterizing threat intelligence data feeds and using these measures to systematically characterize a broad range of public and commercial sources. Further, we ground our quantitative assessments using external measurements to qualitatively investigate issues of coverage and accuracy. Unfortunately, our measurement results suggest that there are significant limitations and challenges in using existing threat intelligence data for its purported goals.  more » « less
Award ID(s):
1705050 1629973
PAR ID:
10100961
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
USENIX Security Symposium
Page Range / eLocation ID:
1-17
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    One of the staples of network defense is blocking traffic to and from a list of "known bad" sites on the Internet. However, few organizations are in a position to produce such a list themselves, so pragmatically this approach depends on the existence of third-party "threat intelligence" providers who specialize in distributing feeds of unwelcome IP addresses. However, the choice to use such a strategy, let alone which data feeds are trusted for this purpose, is rarely made public and thus little is understood about the deployment of these techniques in the wild. To explore this issue, we have designed and implemented a technique to infer proactive traffic blocking on a remote host and, through a series of measurements, to associate that blocking with the use of particular IP blocklists. In a pilot study of 220K US hosts, we find as many as one fourth of the hosts appear to blocklist based on some source of threat intelligence data, and about 2% use one of the 9 particular third-party blocklists that we evaluated. 
    more » « less
  2. null (Ed.)
    The implementation of Internet of Things (IoT) devices in medical environments, has introduced a growing list of security vulnerabilities and threats. The lack of an extensible big data resource that captures medical device vulnerabilities limits the use of Artificial Intelligence (AI) based cyber defense systems in capturing, detecting, and preventing known and future attacks. We describe a system that generates a repository of Cyber Threat Intelligence (CTI) about various medical devices and their known vulnerabilities from sources such as manufacturer and ICS-CERT vulnerability alerts. We augment the intelligence repository with data sources such as Wikidata and public medical databases. The combined resources are integrated with threat intelligence in our Cybersecurity Knowledge Graph (CKG) from previous research. The augmented graph embeddings are useful in querying relevant information and can help in various AI assisted cybersecurity tasks. Given the integration of multiple resources, we found the augmented CKG produced higher quality graph representations. The augmented CKG produced a 31% increase in the Mean Average Precision (MAP) value, computed over an information retrieval task. 
    more » « less
  3. In light of the numerous peculiar events that persistently challenge the world, it is paramount to possess the capacity to thoroughly analyze the realm of cyberspace and cyber threats in the context of these circumstances. As such, adequately integrating data-driven intelligence in cyber analytics can help strengthen security postures and enable effective decision making. In this paper, we introduce a multifaceted Internet-scale, data-driven framework to enable the consistent measurement, identification and characterization of cyber threat dynamics amid real-world events. Particularly, our proposed framework scrutinizes Internet-wide security data feeds from multiple sources, including, (i) a large network telescope to infer illicit activities at large, (ii) a cluster of globally distributed sensor and honeypot to quantify reflective amplification attempts, and (iii) a set of BGP collectors to analyze Remotely Triggered Black Hole (RTBH) events. Specifically, we employ our framework to shed light on the 2022 Russo-Ukrainian cyber threat activities by drawing upon Terabytes of real network and security data feeds. We infer DDoS and UDP reflective attacks targeting federal agencies in Russia, and media entities in Ukraine. We further perceive an upsurge of Russian and Ukrainian RTBH techniques employed to block attacks targeting. ru domains and media companies. Additionally, we uncover an escalation of reconnaissance events, some of which are generated by the IoT-centric Mirai malware and others which target critical infrastructure. We report our findings objectively while postulating thoughts on intriguing observations on that particular event. Our Internet-scale data-driven framework offers a robust approach for empirical analysis of cyber threats in the face of real-world challenges; enabling effective and well-informed decision making. 
    more » « less
  4. Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks and is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG). There is a strong need to develop community-accessible datasets to train existing AI-based cybersecurity pipelines to efficiently and accurately extract meaningful insights from CTI. We have created an initial unstructured CTI corpus from a variety of open sources that we are using to train and test cybersecurity entity models using the spaCy framework and exploring self-learning methods to automatically recognize cybersecurity entities. We also describe methods to apply cybersecurity domain entity linking with existing world knowledge from Wikidata. Our future work will survey and test spaCy NLP tools, and create methods for continuous integration of new information extracted from text. 
    more » « less
  5. The open radio access network (O-RAN) offers new degrees of freedom for building and operating advanced cellular networks. Emphasizing on RAN disaggregation, open interfaces, multi-vendor support, and RAN intelligent controllers (RICs), O-RAN facilitates adaptation to new applications and technology trends. Yet, this architecture introduces new security challenges. This article proposes leveraging zero trust principles for O-RAN security. We introduce zero trust RAN (ZTRAN), which embeds service authentication, intrusion detection, and secure slicing subsystems that are encapsulated as xApps. We implement ZTRAN on the open artificial intelligence cellular (OAIC) research platform and demonstrate its feasibility and effectiveness in terms of legitimate user throughput and latency figures. Our experimental analysis illustrates how ZTRAN's intrusion detection and secure slicing microservices operate effectively and in concert as part of O-RAN Alliance's containerized near-real time RIC. Research directions include exploring machine learning and additional threat intelligence feeds for improving the performance and extending the scope of ZTRAN. 
    more » « less