-
Internet censorship is pervasive, with significant effort dedicated to understanding what is censored, and where. Prior censorship measurements, however, have identified significant inconsistencies in their results; experiments show unexplained non-deterministic behaviors thought to be caused by censor load, end-host geographic diversity, or incomplete censorship. These inconsistencies impede reliable, repeatable, and correct understanding of global censorship. In this work we investigate the extent to which Equal-cost Multi-path (ECMP) routing is the cause of these inconsistencies, and we develop methods to measure and compensate for them. We find that ECMP routing significantly changes observed censorship across protocols, censor mechanisms, and 18 countries. We identify that previously observed non-determinism or regional variation is attributable to measurements between fixed end-hosts taking different routes based on Flow-ID; that is, the choice of intra-subnet source IP or ephemeral source port leads to differences in observed censorship. To achieve this, we develop new route-stable censorship measurement methods that allow consistent measurement of DNS, HTTP, and HTTPS censorship. We find that ECMP routing yields censorship changes across 42% of IPs and 51% of ASes, but that the impact is not uniform. We develop an application-level traceroute tool to construct network paths using specific censored packets, leading us to identify numerous causes of this behavior, ranging from likely failed infrastructure to routes to the same end-host taking geographically diverse paths that experience different censorship en route. Finally, we compare our results to prior global measurements, demonstrating that prior studies were possibly impacted by this phenomenon and that specific results are explainable by ECMP routing. Our work points to methods for improving future studies, reducing inconsistencies, and increasing repeatability.
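To make the Flow-ID observation concrete, here is a minimal sketch (not the authors' tool) of the underlying idea: ECMP routers typically hash the 5-tuple (source IP, destination IP, source port, destination port, protocol), so pinning the source IP and source port keeps repeated probes on one path, while sweeping the source port samples alternate paths. All addresses, ports, and the destination host below are hypothetical placeholders.

```python
# Sketch of flow-pinned probing, assuming ECMP hashes the 5-tuple.
import socket

def probe(dst_ip: str, dst_port: int, src_ip: str, src_port: int,
          payload: bytes, timeout: float = 5.0) -> bytes:
    """Send one TCP probe with a fixed flow ID and return whatever comes back."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow quick rebinding of the port
    s.settimeout(timeout)
    s.bind((src_ip, src_port))  # fixing (src_ip, src_port) fixes the ECMP hash input
    try:
        s.connect((dst_ip, dst_port))
        s.sendall(payload)
        return s.recv(4096)
    except OSError:
        return b""  # timeouts and resets are themselves a censorship signal
    finally:
        s.close()

# Hypothetical usage: repeat the same HTTP request over several distinct flow IDs
# and compare outcomes; differences across source ports suggest path-dependent censorship.
# 198.51.100.5 must be an address configured on the measurement host; 203.0.113.10 is a
# placeholder destination.
request = b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
for sport in range(40000, 40008):
    reply = probe("203.0.113.10", 80, "198.51.100.5", sport, request)
    print(sport, "no response / reset" if reply == b"" else f"{len(reply)} bytes")
```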
-
Internet-wide scanning is a critical tool for security researchers and practitioners alike. By exhaustively exploring the entire IPv4 address space, Internet scanning has driven the development of new security protocols, found and tracked vulnerabilities, improved DDoS defenses, and illuminated global censorship. Unfortunately, the vast scale of the IPv6 address space (roughly 340 trillion trillion trillion addresses) precludes exhaustive scanning, necessitating entirely new IPv6-specific scanning methods. As IPv6 adoption continues to grow, developing IPv6 scanning methods is vital for maintaining our capability to comprehensively investigate Internet security. We present 6SENSE, an end-to-end Internet-wide IPv6 scanning system. 6SENSE uses reinforcement learning coupled with an online scanner to iteratively reduce the space of possible IPv6 addresses to a tractable, scannable subspace, thus discovering new IPv6 Internet hosts. 6SENSE is driven by a set of metrics we identify and define as key for evaluating the generality, diversity, and correctness of IPv6 scanning. We evaluate 6SENSE and prior generative IPv6 discovery methods against these metrics, showing that 6SENSE identifies tens of millions of IPv6 hosts, which, compared to prior approaches, is up to 3.6x more hosts and 4x more end-site assignments, across a more diverse set of networks. From our analysis, we identify limitations in prior generative approaches that preclude their use for Internet-scale security scans. We also conduct the first Internet-wide, scanning-driven security analysis of IPv6 hosts, focusing on TLS certificates unique to IPv6, surveying open ports and security-sensitive services, and identifying potential CVEs.
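As a toy illustration of the feedback loop described above (this is not 6SENSE's actual algorithm), one can treat candidate IPv6 prefixes as arms of a bandit: spend scan budget on the prefixes with the best observed hit rate and fold the results back into the per-prefix statistics. The seed prefixes, probe function, and budget below are assumed placeholders.

```python
# Bandit-style sketch of iterative IPv6 target selection with scan feedback.
import ipaddress
import random

def probe_alive(addr: ipaddress.IPv6Address) -> bool:
    """Placeholder for a real prober (e.g., an ICMPv6/TCP scanner); stubbed here."""
    return random.random() < 0.01  # pretend ~1% of generated targets respond

def pick_prefix(stats, epsilon=0.1):
    """Epsilon-greedy choice over prefixes by empirical hit rate."""
    if random.random() < epsilon:
        return random.choice(list(stats))
    return max(stats, key=lambda p: stats[p]["hits"] / max(stats[p]["tries"], 1))

seeds = [ipaddress.ip_network(p) for p in ("2001:db8:a::/48", "2001:db8:b::/48")]
stats = {p: {"hits": 0, "tries": 0} for p in seeds}
discovered = set()

for _ in range(1000):  # total probe budget
    prefix = pick_prefix(stats)
    # Generate a candidate inside the chosen prefix; real systems model address
    # structure (e.g., low interface IDs) rather than sampling uniformly.
    target = prefix.network_address + random.randrange(prefix.num_addresses)
    stats[prefix]["tries"] += 1
    if probe_alive(target):
        stats[prefix]["hits"] += 1
        discovered.add(target)

print(f"discovered {len(discovered)} responsive addresses")
for p, s in stats.items():
    print(p, f'{s["hits"]}/{s["tries"]}')
```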
-
The term "threat intelligence" has swiftly become a staple buzzword in the computer security industry. The entirely reasonable premise is that, by compiling up-to-date information about known threats (i.e., IP addresses, domain names, file hashes, etc.), recipients of such information may be able to better defend their systems from future attacks. Thus, today a wide array of public and commercial sources distribute threat intelligence data feeds to support this purpose. However, our understanding of this data, its characterization and the extent to which it can meaningfully support its intended uses, is still quite limited. In this paper, we address these gaps by formally defining a set of metrics for characterizing threat intelligence data feeds and using these measures to systematically characterize a broad range of public and commercial sources. Further, we ground our quantitative assessments using external measurements to qualitatively investigate issues of coverage and accuracy. Unfortunately, our measurement results suggest that there are significant limitations and challenges in using existing threat intelligence data for its purported goals.more » « less