skip to main content

Title: Clairvoyance: Inferring Blocklist Use on the Internet
One of the staples of network defense is blocking traffic to and from a list of "known bad" sites on the Internet. However, few organizations are in a position to produce such a list themselves, so pragmatically this approach depends on the existence of third-party "threat intelligence" providers who specialize in distributing feeds of unwelcome IP addresses. However, the choice to use such a strategy, let alone which data feeds are trusted for this purpose, is rarely made public and thus little is understood about the deployment of these techniques in the wild. To explore this issue, we have designed and implemented a technique to infer proactive traffic blocking on a remote host and, through a series of measurements, to associate that blocking with the use of particular IP blocklists. In a pilot study of 220K US hosts, we find as many as one fourth of the hosts appear to blocklist based on some source of threat intelligence data, and about 2% use one of the 9 particular third-party blocklists that we evaluated.
; ; ; ;
Award ID(s):
1705050 1629973
Publication Date:
Journal Name:
Passive and Active Measurement (PAM 2021)
Page Range or eLocation-ID:
57 - 75
Sponsoring Org:
National Science Foundation
More Like this
  1. The term "threat intelligence" has swiftly become a staple buzzword in the computer security industry. The entirely reasonable premise is that, by compiling up-to-date information about known threats (i.e., IP addresses, domain names, file hashes, etc.), recipients of such information may be able to better defend their systems from future attacks. Thus, today a wide array of public and commercial sources distribute threat intelligence data feeds to support this purpose. However, our understanding of this data, its characterization and the extent to which it can meaningfully support its intended uses, is still quite limited. In this paper, we address these gaps by formally defining a set of metrics for characterizing threat intelligence data feeds and using these measures to systematically characterize a broad range of public and commercial sources. Further, we ground our quantitative assessments using external measurements to qualitatively investigate issues of coverage and accuracy. Unfortunately, our measurement results suggest that there are significant limitations and challenges in using existing threat intelligence data for its purported goals.
  2. Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art detects and blocks outgoing A&T HTTP/S requests by using manually curated filter lists (e.g. EasyList), and recently, using machine learning approaches. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules that can then be used to block traffic or label ground truth datasets. We propose NoMoATS – a system that removes this bottleneck by reducing the daunting task of manually creating filter rules, to the much easier and scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real-time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts, previously undetected by popular filter lists. To the best of our knowledge, our system is themore »first to (1) automatically label which mobile network requests are engaged in A&T, while requiring to only manually label libraries to their purpose and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads, but also tracking.« less
  3. Abstract Despite the prevalence of Internet of Things (IoT) devices, there is little information about the purpose and risks of the Internet traffic these devices generate, and consumers have limited options for controlling those risks. A key open question is whether one can mitigate these risks by automatically blocking some of the Internet connections from IoT devices, without rendering the devices inoperable. In this paper, we address this question by developing a rigorous methodology that relies on automated IoT-device experimentation to reveal which network connections (and the information they expose) are essential, and which are not. We further develop strategies to automatically classify network traffic destinations as either required ( i.e. , their traffic is essential for devices to work properly) or not, hence allowing firewall rules to block traffic sent to non-required destinations without breaking the functionality of the device. We find that indeed 16 among the 31 devices we tested have at least one blockable non-required destination, with the maximum number of blockable destinations for a device being 11. We further analyze the destination of network traffic and find that all third parties observed in our experiments are blockable, while first and support parties are neither uniformly requiredmore »or non-required. Finally, we demonstrate the limitations of existing blocklists on IoT traffic, propose a set of guidelines for automatically limiting non-essential IoT traffic, and we develop a prototype system that implements these guidelines.« less
  4. State-of-the-art System-on-Chip (SoC) designs consist of many Intellectual Property (IP) cores that interact using a Network-on-Chip (NoC) architecture. SoC designers increasingly rely on global supply chains for obtaining third-party IPs. In addition to inherent vulnerabilities associated with utilizing third-party IPs, NoC based SoCs enable attackers to exploit the distributed nature of NoC and its connectivity with various IPs to launch a plethora of attacks. Specifically, Denial-of-Service (DoS) attacks pose a serious threat in degrading the SoC performance by flooding the NoC with unnecessary packets. In this paper, we present a machine learning-based runtime monitoring mechanism to detect DoS attacks. The models are statically trained and used for runtime attack detection leading to minimum runtime performance overhead. Our approach is capable of detecting DoS attacks with high accuracy, even in the presence of unpredictable NoC traffic patterns caused by various application mappings. We extensively explore machine learning models and features to provide a comprehensive study on how to use machine learning for DoS attack detection in NoC-based SoCs.
  5. Abstract Smart DNS (SDNS) services advertise access to geofenced content (typically, video streaming sites such as Netflix or Hulu) that is normally inaccessible unless the client is within a prescribed geographic region. SDNS is simple to use and involves no software installation. Instead, it requires only that users modify their DNS settings to point to an SDNS resolver. The SDNS resolver “smartly” identifies geofenced domains and, in lieu of their proper DNS resolutions, returns IP addresses of proxy servers located within the geofence. These servers then transparently proxy traffic between the users and their intended destinations, allowing for the bypass of these geographic restrictions. This paper presents the first academic study of SDNS services. We identify a number of serious and pervasive privacy vulnerabilities that expose information about the users of these systems. These include architectural weaknesses that enable content providers to identify which requesting clients use SDNS. Worse, we identify flaws in the design of some SDNS services that allow any arbitrary third party to enumerate these services’ users (by IP address), even if said users are currently offline. We present mitigation strategies to these attacks that have been adopted by at least one SDNS provider in response tomore »our findings.« less