
Title: Risky BIZness: risks derived from registrar name management
In this paper, we explore a domain hijacking vulnerability that is an accidental byproduct of undocumented operational practices between domain registrars and registries. We show how, over the last nine years, more than 512K domains have been implicitly exposed to the risk of hijacking, affecting names in most popular TLDs (including .com and .net) as well as legacy TLDs with tight registration control (such as .edu and .gov). Moreover, we show that this weakness has been actively exploited by multiple parties who, over the years, have assumed control over 163K domains without having any ownership interest in those names. In addition to characterizing the nature and size of this problem, we also report on the efficacy of remediation efforts in response to our outreach to registrars.
Journal Name:
Proceedings of the 21st ACM Internet Measurement Conference (IMC '21)
Page Range or eLocation-ID:
673 to 686
Sponsoring Org:
National Science Foundation
More Like This
  1. We identify over a quarter of a million domains used by medium and large companies within the .com registry. We find that for around 7% of these companies, very similar domain names have been registered with character changes that are intended to be indistinguishable at a casual glance. These domains would be suitable for use in Business Email Compromise frauds. Using historical registration and name server data, we identify the timing, rate, and movement of these look-alike domains over a ten-year period. This allows us to identify clusters of registrations which are quite clearly malicious and to show how the criminals have moved their activity over time in response to countermeasures. Although the malicious activity peaked in 2016, there is still sufficient ongoing activity to cause concern.
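The kind of look-alike detection the abstract describes can be approximated with homoglyph normalization plus edit distance. The sketch below is a hypothetical illustration (the homoglyph table and thresholds are assumptions, not the paper's method):

```python
# Hypothetical sketch: flag candidate look-alike domains by normalizing
# common visually-confusable substrings and comparing edit distance to a
# target brand label. Not the paper's actual detection pipeline.

HOMOGLYPHS = {"0": "o", "1": "l", "3": "e", "5": "s", "rn": "m", "vv": "w"}

def normalize(label: str) -> str:
    """Map visually confusable substrings to a canonical form."""
    label = label.lower()
    for glyph, canon in HOMOGLYPHS.items():
        label = label.replace(glyph, canon)
    return label

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def looks_alike(candidate: str, target: str, max_dist: int = 1) -> bool:
    """Suspicious if the candidate nearly matches after normalization,
    or matches exactly only once homoglyphs are canonicalized."""
    if candidate == target:
        return False
    norm_c, norm_t = normalize(candidate), normalize(target)
    return norm_c == norm_t or 0 < edit_distance(norm_c, norm_t) <= max_dist
```

Real systems would also need to handle Unicode confusables and keyboard-adjacency typos; this only covers ASCII substitutions.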
  2. Take-down operations aim to disrupt cybercrime involving malicious domains. In the past decade, many successful take-down operations have been reported, including those against the Conficker worm and, most recently, against VPNFilter. Although take-downs play an important role in fighting cybercrime, the procedure is still surprisingly opaque. There seems to be no in-depth understanding of how a take-down operation works and whether there is due diligence to ensure its security and reliability. In this paper, we report the first systematic study on domain take-down. Our study was made possible via a large collection of data, including various sinkhole feeds and blacklists, passive DNS data spanning six years, and historical WHOIS information. Over these datasets, we built a unique methodology that extensively used various reverse lookups and other data analysis techniques to address the challenges in identifying taken-down domains, sinkhole operators, and take-down durations. Applying the methodology to the data, we discovered over 620K taken-down domains and conducted a longitudinal analysis of the take-down process, thus facilitating a better understanding of the operation and its weaknesses. We found that more than 14% of domains taken down over the past ten months have been released back to the domain market and that some of the released domains have been repurchased by the malicious actor again before being captured and seized, either by the same or different sinkholes. In addition, we showed that the misconfiguration of DNS records corresponding to the sinkholed domains allowed us to hijack a domain that was seized by the FBI. Further, we found that expired sinkholes have caused the transfer of around 30K taken-down domains whose traffic is now under the control of new owners.
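One building block of the kind of sinkhole identification described above is matching a domain's name servers against known sinkhole zones. A minimal sketch, assuming a hypothetical suffix list and made-up passive-DNS records (none of these names come from the paper):

```python
# Hypothetical sketch: classify a domain as likely taken down if its
# authoritative name servers fall under a known sinkhole operator's zone.
# The suffix list and records below are invented for illustration.

SINKHOLE_NS_SUFFIXES = (
    "sinkhole.example-registrar.net",
    "takedown.example-lea.org",
)

def is_sinkholed(name_servers: list[str]) -> bool:
    """True if any NS record points into a known sinkhole zone."""
    return any(
        ns.lower().rstrip(".").endswith(suffix)
        for ns in name_servers
        for suffix in SINKHOLE_NS_SUFFIXES
    )

# Passive-DNS-style observations: domain -> NS records seen for it.
observed = {
    "bad-domain.example": ["ns1.sinkhole.example-registrar.net."],
    "benign.example": ["ns1.hosting.example"],
}

for domain, ns_set in observed.items():
    print(domain, is_sinkholed(ns_set))
```

The study's actual methodology also cross-references sinkhole feeds, blacklists, and WHOIS history; suffix matching alone would miss sinkholes operated on unlabeled infrastructure.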
  3. Identifying people in photographs is a critical task in a wide variety of domains, from national security [7] to journalism [14] to human rights investigations [1]. The task is also fundamentally complex and challenging. With the world population at 7.6 billion and growing, the candidate pool is large. Studies of human face recognition ability show that the average person incorrectly identifies two people as similar 20–30% of the time, and trained police detectives do not perform significantly better [11]. Computer vision-based face recognition tools have gained considerable ground and are now widely available commercially, but comparisons to human performance show mixed results at best [2,10,16]. Automated face recognition techniques, while powerful, also have constraints that may be impractical for many real-world contexts. For example, face recognition systems tend to suffer when the target image or reference images have poor quality or resolution, as blemishes or discolorations may be incorrectly recognized as false positives for facial landmarks. Additionally, most face recognition systems ignore some salient facial features, like scars or other skin characteristics, as well as distinctive non-facial features, like ear shape or hair or facial hair styles. This project investigates how we can overcome these limitations to support person identification tasks. By adjusting confidence thresholds, users of face recognition can generally expect high recall (few false negatives) at the cost of low precision (many false positives). Therefore, we focus our work on the "last mile" of person identification, i.e., helping a user find the correct match among a large set of similar-looking candidates suggested by face recognition. Our approach leverages the powerful capabilities of the human vision system and collaborative sensemaking via crowdsourcing to augment the complementary strengths of automatic face recognition.
The result is a novel technology pipeline combining collective intelligence and computer vision. We scope this project to focus on identifying soldiers in photos from the American Civil War era (1861–1865). An estimated 4,000,000 soldiers fought in the war, and most were photographed at least once, due to decreasing costs, the increasing robustness of the format, and the critical events separating friends and family [17]. Over 150 years later, the identities of most of these portraits have been lost, but as museums and archives increasingly digitize and publish their collections online, the pool of reference photos and information has never been more accessible. Historians, genealogists, and collectors work tirelessly to connect names with faces, using largely manual identification methods [3,9]. Identifying people in historical photos is important for preserving material culture [9], correcting the historical record [13], and recognizing contributions of marginalized groups [4], among other reasons.
  4. Margueron R; Holoch D (Eds.)
    Dynamic posttranslational modifications to canonical histones that constitute the nucleosome (H2A, H2B, H3, and H4) control all aspects of enzymatic transactions with DNA. Histone methylation has been studied heavily for the past 20 years, and our mechanistic understanding of the control and function of individual methylation events on specific histone arginine and lysine residues has been greatly improved over the past decade, driven by excellent new tools and methods. Here, we will summarize what is known about the distribution and some of the functions of protein methyltransferases from all major eukaryotic supergroups. The main conclusion is that protein, and specifically histone, methylation is an ancient process. Many taxa in all supergroups have lost some subfamilies of both protein arginine methyltransferases (PRMT) and the heavily studied SET domain lysine methyltransferases (KMT). Over time, novel subfamilies, especially of SET domain proteins, arose. We use the interactions between H3K27 and H3K36 methylation as one example for the complex circuitry of histone modifications that make up the "histone code," and we discuss one recent example (Paramecium Ezl1) for how extant enzymes that may resemble more ancient SET domain KMTs are able to modify two lysine residues that have divergent functions in plants, fungi, and animals. Complexity of SET domain KMT function in the well-studied plant and animal lineages arose not only by gene duplication but also acquisition of novel DNA- and histone-binding domains in certain subfamilies.
  5. The decompiler is one of the most common tools for examining executable binaries without the corresponding source code. It transforms binaries into high-level code, reversing the compilation process. Unfortunately, decompiler output is far from readable because the decompilation process is often incomplete. State-of-the-art techniques use machine learning to predict missing information like variable names. While these approaches are often able to suggest good variable names in context, no existing work examines how the selection of training data influences these machine learning models. We investigate how data provenance and the quality of training data affect performance, and how well, if at all, trained models generalize across software domains. We focus on the variable renaming problem using one such machine learning model, DIRE. We first describe DIRE in detail and the accompanying technique used to generate training data from raw code. We also evaluate DIRE's overall performance without respect to data quality. Next, we show how training on more popular, possibly higher quality code (measured using GitHub stars) leads to a more generalizable model because popular code tends to have more diverse variable names. Finally, we evaluate how well DIRE predicts domain-specific identifiers, propose a modification to incorporate domain information, and show that it can predict identifiers in domain-specific scenarios 23% more frequently than the original DIRE model.