skip to main content

Title: Trufflehunter: Cache Snooping Rare Domains at Large Public DNS Resolvers
This paper presents and evaluates Trufflehunter, a DNS cache snooping tool for estimating the prevalence of rare and sensitive Internet applications. Unlike previous efforts that have focused on small, misconfigured open DNS resolvers, Trufflehunter models the complex behavior of large multi-layer distributed caching infrastructures (e.g., such as Google Public DNS). In particular, using controlled experiments, we have inferred the caching strategies of the four most popular public DNS resolvers (Google Public DNS, Cloudflare Quad1, OpenDNS and Quad9). The large footprint of such resolvers presents an opportunity to observe rare domain usage, while preserving the privacy of the users accessing them. Using a controlled testbed, we evaluate how accurately Trufflehunter can estimate domain name usage across the U.S. Applying this technique in the wild, we provide a lower-bound estimate of the popularity of several rare and sensitive applications (most notably smartphone stalkerware) which are otherwise challenging to survey.
; ; ; ; ; ;
Award ID(s):
1705050 1629973 1724853
Publication Date:
Journal Name:
Proceedings of the Internet Measurement Conference (IMC'20)
Page Range or eLocation-ID:
50 to 64
Sponsoring Org:
National Science Foundation
More Like this
  1. Content delivery networks (CDNs) commonly use DNS to map end-users to the best edge servers. A recently proposed EDNS0-Client-Subnet (ECS) extension allows recursive resolvers to include end-user subnet information in DNS queries, so that authoritative DNS servers, especially those belonging to CDNs, could use this information to improve user mapping. In this paper, we study the ECS behavior of ECS-enabled recursive resolvers from the perspectives of the opposite sides of a DNS interaction, the authoritative DNS servers of a major CDN and a busy DNS resolution service. We find a range of erroneous (i.e., deviating from the protocol specification) and detrimental (even if compliant) behaviors that may unnecessarily erode client privacy, reduce the effectiveness of DNS caching, diminish ECS benefits, and in some cases turn ECS from facilitator into an obstacle to authoritative DNS servers' ability to optimize user-to-edge-server mappings.
  2. Machine learning is an expanding field with an ever-increasing role in everyday life, with its utility in the industrial, agricultural, and medical sectors being undeniable. Recently, this utility has come in the form of machine learning implementation on embedded system devices. While there have been steady advances in the performance, memory, and power consumption of embedded devices, most machine learning algorithms still have a very high power consumption and computational demand, making the implementation of embedded machine learning somewhat difficult. However, different devices can be implemented for different applications based on their overall processing power and performance. This paper presents an overview of several different implementations of machine learning on embedded systems divided by their specific device, application, specific machine learning algorithm, and sensors. We will mainly focus on NVIDIA Jetson and Raspberry Pi devices with a few different less utilized embedded computers, as well as which of these devices were more commonly used for specific applications in different fields. We will also briefly analyze the specific ML models most commonly implemented on the devices and the specific sensors that were used to gather input from the field. All of the papers included in this review were selected using Googlemore »Scholar and published papers in the IEEExplore database. The selection criterion for these papers was the usage of embedded computing systems in either a theoretical study or practical implementation of machine learning models. The papers needed to have provided either one or, preferably, all of the following results in their studies—the overall accuracy of the models on the system, the overall power consumption of the embedded machine learning system, and the inference time of their models on the embedded system. Embedded machine learning is experiencing an explosion in both scale and scope, both due to advances in system performance and machine learning models, as well as greater affordability and accessibility of both. Improvements are noted in quality, power usage, and effectiveness.« less
  3. Hohlfeld, O ; Moura, G ; Pelsser, C. (Ed.)
    While the DNS protocol encompasses both UDP and TCP as its underlying transport, UDP is commonly used in practice. At the same time, increasingly large DNS responses and concerns over amplification denial of service attacks have heightened interest in conducting DNS interactions over TCP. This paper surveys the support for DNS-over-TCP in the deployed DNS infrastructure from several angles. First, we assess resolvers responsible for over 66.2% of the external DNS queries that arrive at a major content delivery network (CDN). We find that 2.7% to 4.8% of the resolvers, contributing around 1.1% to 4.4% of all queries arriving at the CDN from the resolvers we study, do not properly fallback to TCP when instructed by authoritative DNS servers. Should a content provider decide to employ TCP-fallback as the means of switching to DNS-over-TCP, it faces the corresponding loss of its customers. Second, we assess authoritative DNS servers (ADNS) for over 10M domains and many CDNs and find some ADNS, serving some popular websites and a number of CDNs, that do not support DNS-over-TCP. These ADNS would deny service to (RFC-compliant) resolvers that choose to switch to TCP-only interactions. Third, we study the TCP connection reuse behavior of DNS actorsmore »and describe a race condition in TCP connection reuse by DNS actors that may become a significant issue should DNS-over-TCP and other TCP-based DNS protocols, such as DNS-over-TLS, become widely used.« less
  4. The Domain Name System (DNS) leverages nearly 1K dis- tributed servers to provide information about the root of the Internet’s namespace. The large size and broad distri- bution of the root nameserver infrastructure has a number of benefits, including providing robustness, low delays to topologically close root servers and a way to cope with the immense torrent of queries destined for the root nameservers. While the root nameserver service operates well, it repre- sents a large community investment. Due to this large cost, in this paper we take the position that DNS’ root nameserv- ers should be eliminated. Instead, recursive resolvers should use a local copy of the root zone file instead of consulting root nameservers. This paper considers the pros and cons of this alternate approach.
  5. The Domain Name System (DNS) is a hierarchical, decentralized, and distributed database. A key mechanism that enables the DNS to be hierarchical and distributed is delegation of responsibility from parent to child zones—typically managed by different entities. RFC1034 states that authoritative nameserver (NS) records at both parent and child should be "consistent and remain so", but we find inconsistencies for over 13M second-level domains. We classify the type of inconsistencies we observe, and the behavior of resolvers in the face of such inconsistencies, using RIPE Atlas to probe our experimental domain configured for different scenarios. Our results underline the risk such inconsistencies pose to the availability of misconfigured domains.