skip to main content


Title: OmniCrawl: Comprehensive Measurement of Web Tracking With Real Desktop and Mobile Browsers
Abstract Over half of all visits to websites now take place in a mobile browser, yet the majority of web privacy studies take the vantage point of desktop browsers, use emulated mobile browsers, or focus on just a single mobile browser instead. In this paper, we present a comprehensive web-tracking measurement study on mobile browsers and privacy-focused mobile browsers. Our study leverages a new web measurement infrastructure, OmniCrawl, which we develop to drive browsers on desktop computers and smartphones located on two continents. We capture web tracking measurements using 42 different non-emulated browsers simultaneously. We find that the third-party advertising and tracking ecosystem of mobile browsers is more similar to that of desktop browsers than previous findings suggested. We study privacy-focused browsers and find their protections differ significantly and in general are less for lower-ranked sites. Our findings also show that common methodological choices made by web measurement studies, such as the use of emulated mobile browsers and Selenium, can lead to website behavior that deviates from what actual users experience.  more » « less
Award ID(s):
1852260 1704542
NSF-PAR ID:
10313206
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings on Privacy Enhancing Technologies
Volume:
2022
Issue:
1
ISSN:
2299-0984
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Online trackers are invasive as they track our digital footprints, many of which are sensitive in nature, and when aggregated over time, they can help infer intricate details about our lifestyles and habits. Although much research has been conducted to understand the effectiveness of existing countermeasures for the desktop platform, little is known about how mobile browsers have evolved to handle online trackers. With mobile devices now generating more web traffic than their desktop counterparts, we fill this research gap through a large-scale comparative analysis of mobile web browsers. We crawl 10K valid websites from the Tranco list on real mobile devices. Our data collection process covers both popular generic browsers (e.g., Chrome, Firefox, and Safari) as well as privacy-focused browsers (e.g., Brave, Duck Duck Go, and Firefox-Focus). We use dynamic analysis of runtime execution traces and static analysis of source codes to highlight the tracking behavior of invasive fingerprinters. We also find evidence of tailored content being served to different browsers. In particular, we note that Firefox Focus sees altered script code, whereas Brave and Duck Duck Go have highly similar content. To test the privacy protection of browsers, we measure the responses of each browser in blocking trackers and advertisers and note the strengths and weaknesses of privacy browsers. To establish ground truth, we use well-known block lists, including EasyList, EasyPrivacy, Disconnect and WhoTracksMe and find that Brave generally blocks the highest number of content that should be blocked as per these lists. Focus performs better against social trackers, and Duck Duck Go restricts third-party trackers that perform email-based tracking. 
    more » « less
  2. Abstract Web measurement is a powerful approach to studying various tracking practices that may compromise the privacy of millions of users. Researchers have built several measurement frameworks and performed a few studies to measure web tracking on the desktop environment. However, little is known about web tracking on the mobile environment, and no tool is readily available for performing a comparative measurement study on mobile and desktop environments. In this work, we built a framework called WTPatrol that allows us and other researchers to perform web tracking measurement on both mobile and desktop environments. Using WTPatrol, we performed the first comparative measurement study of web tracking on 23,310 websites that have both mobile version and desktop version web-pages. We conducted an in-depth comparison of the web tracking practices of those websites between mobile and desktop environments from two perspectives: web tracking based on JavaScript APIs and web tracking based on HTTP cookies. Overall, we found that mobile web tracking has its unique characteristics especially due to mobile-specific trackers, and it has become increasingly as prevalent as desktop web tracking. However, the potential impact of mobile web tracking is more severe than that of desktop web tracking because a user may use a mobile device frequently in different places and be continuously tracked. We further gave some suggestions to web users, developers, and researchers to defend against web tracking. 
    more » « less
  3. Conti, Mauro ; Zhou, Jianying ; Spognardi, Angelo (Ed.)
    Increasingly more mobile browsers are developed to use proxies for traffic compression and censorship circumvention. While these browsers can offer such desirable features, their security implications are, however, not well understood, especially when tangled with TLS in the mix. Apart from vendor-specific proprietary designs, there are mainly 2 models of using proxies with browsers: TLS interception and HTTP tunneling. To understand the current practices employed by proxy-based mobile browsers, we analyze 34 Android browser apps that are representative of the ecosystem, and examine how their deployments are affecting communication security. Though the impacts of TLS interception on security was studied before in other contexts, proxy-based mobile browsers were not considered previously. In addition, the tunneling model requires the browser itself to enforce certain desired security policies (e.g., validating certificates and avoiding the use of weak cipher suites), and it is preferable to have such enforcement matching the security level of conventional desktop browsers. Our evaluation shows that many proxy-based mobile browsers downgrade the overall quality of TLS sessions, by for example allowing old versions of TLS (e.g., SSLv3.0 and TLSv1.0) and accepting weak cryptographic algorithms (e.g., 3DES and RC4) as well as unsatisfactory certificates (e.g., revoked or signed by untrusted CAs), thus exposing their users to potential security and privacy threats. We have reported our findings to the vendors of vulnerable proxy-based browsers and are waiting for their response. 
    more » « less
  4. null (Ed.)
    Monetizing websites and web apps through online advertising is widespread in the web ecosystem, creating a billion-dollar market. This has led to the emergence of a vast network of tertiary ad providers and ad syndication to facilitate this growing market. Nowadays, the online advertising ecosystem forces publishers to integrate ads from these third-party domains. On the one hand, this raises several privacy and security concerns that are actively being studied in recent years. On the other hand, the ability of today's browsers to load dynamic web pages with complex animations and Javascript has also transformed online advertising. This can have a significant impact on webpage performance. The latter is a critical metric for optimization since it ultimately impacts user satisfaction. Unfortunately, there are limited literature studies on understanding the performance impacts of online advertising which we argue is as important as privacy and security. In this paper, we apply an in-depth and first-of-a-kind performance evaluation of web ads. Unlike prior efforts that rely primarily on adblockers, we perform a fine-grained analysis on the web browser's page loading process to demystify the performance cost of web ads. We aim to characterize the cost by every component of an ad, so the publisher, ad syndicate, and advertiser can improve the ad's performance with detailed guidance. For this purpose, we develop a tool, adPerf, for the Chrome browser that classifies page loading workloads into ad-related and main-content at the granularity of browser activities. Our evaluations show that online advertising entails more than 15% of browser page loading workload and approximately 88% of that is spent on JavaScript. On smartphones, this additional cost of ads is 7% lower since mobile pages include fewer and well-optimized ads. We also track the sources and delivery chain of web ads and analyze performance considering the origin of the ad contents. We observe that 2 of the well-known third-party ad domains contribute to 35% of the ads performance cost and surprisingly, top news websites implicitly include unknown third-party ads which in some cases build up to more than 37% of the ads performance cost. 
    more » « less
  5. The transparency and privacy behavior of mobile browsers has remained widely unexplored by the research community. In fact, as opposed to regular Android apps, mobile browsers may present contradicting privacy behaviors. On the one end, they can have access to (and can expose) a unique combination of sensitive user data, from users’ browsing history to permission-protected personally identifiable information (PII) such as unique identifiers and geolocation. However, on the other end, they also are in a unique position to protect users’ privacy by limiting data sharing with other parties by implementing ad-blocking features. In this paper, we perform a comparative and empirical analysis on how hundreds of Android web browsers protect or expose user data during browsing sessions. To this end, we collect the largest dataset of Android browsers to date, from the Google Play Store and four Chinese app stores. Then, we developed a novel analysis pipeline that combines static and dynamic analysis methods to find a wide range of privacy-enhancing (e.g., ad-blocking) and privacy-harming behaviors (e.g., sending browsing histories to third parties, not validating TLS certificates, and exposing PII---including non-resettable identifiers---to third parties) across browsers. We find that various popular apps on both Google Play and Chinese stores have these privacy-harming behaviors, including apps that claim to be privacy-enhancing in their descriptions. Overall, our study not only provides new insights into important yet overlooked considerations for browsers’ adoption and transparency, but also that automatic app analysis systems (e.g., sandboxes) need context-specific analysis to reveal such privacy behaviors. 
    more » « less