skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Thursday, February 13 until 2:00 AM ET on Friday, February 14 due to maintenance. We apologize for the inconvenience.


Title: Analyzing Third Party Service Dependencies in Modern Web Services: Have We Learned from the Mirai-Dyn Incident?
Many websites rely on third parties for services (e.g., DNS, CDN, etc.). However, it also exposes them to shared risks from attacks (e.g., Mirai DDoS attack [24]) or cascading failures (e.g., GlobalSign revocation error [21]). Motivated by such incidents, we analyze the prevalence and impact of third-party dependencies, focusing on three critical infrastructure services: DNS, CDN, and certificate revocation checking by CA. We analyze both direct (e.g., Twitter uses Dyn) and indirect (e.g., Netflix uses Symantec as CA which uses Verisign for DNS) dependencies. We also take two snapshots in 2016 and 2020 to understand how the dependencies evolved. Our key findings are: (1) 89% of the Alexa top-100K websites critically depend on third-party DNS, CDN, or CA providers i.e., if these providers go down, these websites could suffer service disruption; (2) the use of third-party services is concentrated, and the top-3 providers of CDN, DNS, or CA services can affect 50%-70% of the top-100K websites; (3) indirect dependencies amplify the impact of popular CDN and DNS providers by up to 25X; and (4) some third-party dependencies and concentration increased marginally between 2016 to 2020. Based on our findings, we derive key impli- cations for different stakeholders in the web ecosystem.  more » « less
Award ID(s):
1801472 1564009
PAR ID:
10316481
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Internet Measurement Conference (IMC)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Third-party dependencies expose websites to shared risks and cascading failures. The dependencies impact African websites as well e.g., Afrihost outage in 2022 [15]. While the prevalence of third-party dependencies has been studied for globally popular websites, Africa is largely underrepresented in those studies. Hence, this work analyzes the prevalence of third-party infrastructure dependencies in Africa-centric websites from 4 African vantage points. We consider websites that fall into one of the four categories: Africa-visited (popular in Africa) Africa-hosted (sites hosted in Africa), Africa-dominant (sites targeted towards users in Africa), and Africa-operated (websites operated in Africa). Our key findings are: 1) 93% of the Africa-visited websites critically depend on a third-party DNS, CDN, or CA. In perspective, US-visited websites are up to 25% less critically dependent. 2) 97% of Africa-dominant, 96% of Africa-hosted, and 95% of Africa-operated websites are critically dependent on a third-party DNS, CDN, or CA provider. 3) The use of third-party services is concentrated where only 3 providers can affect 60% of the Africa-centric websites. Our findings have key implications for the present usage and recommendations for the future evolution of the Internet in Africa. 
    more » « less
  2. Abstract The rise of ad-blockers is viewed as an economic threat by online publishers who primarily rely on online advertising to monetize their services. To address this threat, publishers have started to retaliate by employing anti ad-blockers , which scout for ad-block users and react to them by pushing users to whitelist the website or disable ad-blockers altogether. The clash between ad-blockers and anti ad-blockers has resulted in a new arms race on the Web. In this paper, we present an automated machine learning based approach to identify anti ad-blockers that detect and react to ad-block users. The approach is promising with precision of 94.8% and recall of 93.1%. Our automated approach allows us to conduct a large-scale measurement study of anti ad-blockers on Alexa top-100K websites. We identify 686 websites that make visible changes to their page content in response to ad-block detection. We characterize the spectrum of different strategies used by anti ad-blockers. We find that a majority of publishers use fairly simple first-party anti ad-block scripts. However, we also note the use of third-party anti ad-block services that use more sophisticated tactics to detect and respond to ad-blockers. 
    more » « less
  3. In the United States, sensitive health information is protected under the Health Insurance Portability and Accountability Act (HIPAA). This act limits the disclosure of Protected Health Information (PHI) without the patient’s consent or knowledge. However, as medical care becomes web-integrated, many providers have chosen to use third-party web trackers for measurement and marketing purposes. This presents a security concern: third-party JavaScript requested by an online healthcare system can read the website’s contents, and ensuring PHI is not unintentionally or maliciously leaked becomes difficult. In this paper, we investigate health information breaches in online medical records, focusing on 459 online patient portals and 4 telehealth websites. We find 14% of patient portals include Google Analytics, which reveals (at a minimum) the fact that the user visited the health provider website, while 5 portals and 4 telehealth websites con- tained JavaScript-based services disclosing PHI, including medications and lab results, to third parties. The most significant PHI breaches were on behalf of Google and Facebook trackers. In the latter case, an estimated 4.5 million site visitors per month were potentially exposed to leaks of personal information (names, phone numbers) and medical information (test results, medications). We notified healthcare providers of the PHI breaches and found only 15.7% took action to correct leaks. Healthcare operators lacked the technical expertise to identify PHI breaches caused by third-party trackers. After notifying Epic, a healthcare portal vendor, of the PHI leaks, we received a prompt response and observed extensive mitigation across providers, suggesting vendor notification is an effective intervention against PHI disclosures. 
    more » « less
  4. Abstract Smart DNS (SDNS) services advertise access to geofenced content (typically, video streaming sites such as Netflix or Hulu) that is normally inaccessible unless the client is within a prescribed geographic region. SDNS is simple to use and involves no software installation. Instead, it requires only that users modify their DNS settings to point to an SDNS resolver. The SDNS resolver “smartly” identifies geofenced domains and, in lieu of their proper DNS resolutions, returns IP addresses of proxy servers located within the geofence. These servers then transparently proxy traffic between the users and their intended destinations, allowing for the bypass of these geographic restrictions. This paper presents the first academic study of SDNS services. We identify a number of serious and pervasive privacy vulnerabilities that expose information about the users of these systems. These include architectural weaknesses that enable content providers to identify which requesting clients use SDNS. Worse, we identify flaws in the design of some SDNS services that allow any arbitrary third party to enumerate these services’ users (by IP address), even if said users are currently offline. We present mitigation strategies to these attacks that have been adopted by at least one SDNS provider in response to our findings. 
    more » « less
  5. Hohlfeld, O ; Moura, G ; Pelsser, C. (Ed.)
    While the DNS protocol encompasses both UDP and TCP as its underlying transport, UDP is commonly used in practice. At the same time, increasingly large DNS responses and concerns over amplification denial of service attacks have heightened interest in conducting DNS interactions over TCP. This paper surveys the support for DNS-over-TCP in the deployed DNS infrastructure from several angles. First, we assess resolvers responsible for over 66.2% of the external DNS queries that arrive at a major content delivery network (CDN). We find that 2.7% to 4.8% of the resolvers, contributing around 1.1% to 4.4% of all queries arriving at the CDN from the resolvers we study, do not properly fallback to TCP when instructed by authoritative DNS servers. Should a content provider decide to employ TCP-fallback as the means of switching to DNS-over-TCP, it faces the corresponding loss of its customers. Second, we assess authoritative DNS servers (ADNS) for over 10M domains and many CDNs and find some ADNS, serving some popular websites and a number of CDNs, that do not support DNS-over-TCP. These ADNS would deny service to (RFC-compliant) resolvers that choose to switch to TCP-only interactions. Third, we study the TCP connection reuse behavior of DNS actors and describe a race condition in TCP connection reuse by DNS actors that may become a significant issue should DNS-over-TCP and other TCP-based DNS protocols, such as DNS-over-TLS, become widely used. 
    more » « less