skip to main content


Title: NoMoAds: Effective and Efficient Cross-App Mobile Ad-Blocking (The Andreas Pfitzmann Best Student Paper Award)
Although advertising is a popular strategy for mobile app monetization, it is often desirable to block ads in order to improve usability, performance, privacy, and security. In this paper, we propose NoMoAds to block ads served by any app on a mobile device. NoMoAds leverages the network interface as a universal vantage point: it can intercept, inspect, and block outgoing packets from all apps on a mobile device. NoMoAds extracts features from packet headers and/or payload to train machine learning classifiers for detecting ad requests. To evaluate NoMoAds, we collect and label a new dataset using both EasyList and manually created rules. We show that NoMoAds is effective: it achieves an F-score of up to 97.8% and performs well when deployed in the wild. Furthermore, NoMoAds is able to detect mobile ads that are missed by EasyList (more than one-third of ads in our dataset). We also show that NoMoAds is efficient: it performs ad classification on a per-packet basis in real-time. To the best of our knowledge, NoMoAds is the first mobile ad-blocker to effectively and efficiently block ads served across all apps using a machine learning approach.  more » « less
Award ID(s):
1649372
NSF-PAR ID:
10077006
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the Privacy Enhancing Technologies Symposium (PETS)
Volume:
2018
Issue:
4
Page Range / eLocation ID:
125–140
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art detects and blocks outgoing A&T HTTP/S requests by using manually curated filter lists (e.g. EasyList), and recently, using machine learning approaches. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules that can then be used to block traffic or label ground truth datasets. We propose NoMoATS – a system that removes this bottleneck by reducing the daunting task of manually creating filter rules, to the much easier and scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real-time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts, previously undetected by popular filter lists. To the best of our knowledge, our system is the first to (1) automatically label which mobile network requests are engaged in A&T, while requiring to only manually label libraries to their purpose and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads, but also tracking. 
    more » « less
  2. null (Ed.)
    Machine learning-based malware detection systems are often vulnerable to evasion attacks, in which a malware developer manipulates their malicious software such that it is misclassified as benign. Such software hides some properties of the real class or adopts some properties of a different class by applying small perturbations. A special case of evasive malware hides by repackaging a bonafide benign mobile app to contain malware in addition to the original functionality of the app, thus retaining most of the benign properties of the original app. We present a novel malware detection system based on metamorphic testing principles that can detect such benign-seeming malware apps. We apply metamorphic testing to the feature representation of the mobile app, rather than to the app itself. That is, the source input is the original feature vector for the app and the derived input is that vector with selected features removed. If the app was originally classified benign, and is indeed benign, the output for the source and derived inputs should be the same class, i.e., benign, but if they differ, then the app is exposed as (likely) malware. Malware apps originally classified as malware should retain that classification, since only features prevalent in benign apps are removed. This approach enables the machine learning model to classify repackaged malware with reasonably few false negatives and false positives. Our training pipeline is simpler than many existing ML-based malware detection methods, as the network is trained end-to-end to jointly learn appropriate features and to perform classification. We pre-trained our classifier model on 3 million apps collected from the widely-used AndroZoo dataset. 1 We perform an extensive study on other publicly available datasets to show our approach’s effectiveness in detecting repackaged malware with more than 94% accuracy, 0.98 precision, 0.95 recall, and 0.96 F1 score. 
    more » « less
  3. Online trackers are invasive as they track our digital footprints, many of which are sensitive in nature, and when aggregated over time, they can help infer intricate details about our lifestyles and habits. Although much research has been conducted to understand the effectiveness of existing countermeasures for the desktop platform, little is known about how mobile browsers have evolved to handle online trackers. With mobile devices now generating more web traffic than their desktop counterparts, we fill this research gap through a large-scale comparative analysis of mobile web browsers. We crawl 10K valid websites from the Tranco list on real mobile devices. Our data collection process covers both popular generic browsers (e.g., Chrome, Firefox, and Safari) as well as privacy-focused browsers (e.g., Brave, Duck Duck Go, and Firefox-Focus). We use dynamic analysis of runtime execution traces and static analysis of source codes to highlight the tracking behavior of invasive fingerprinters. We also find evidence of tailored content being served to different browsers. In particular, we note that Firefox Focus sees altered script code, whereas Brave and Duck Duck Go have highly similar content. To test the privacy protection of browsers, we measure the responses of each browser in blocking trackers and advertisers and note the strengths and weaknesses of privacy browsers. To establish ground truth, we use well-known block lists, including EasyList, EasyPrivacy, Disconnect and WhoTracksMe and find that Brave generally blocks the highest number of content that should be blocked as per these lists. Focus performs better against social trackers, and Duck Duck Go restricts third-party trackers that perform email-based tracking. 
    more » « less
  4. null (Ed.)
    Targeted advertisement is prevalent on the Web. Many privacy-enhancing tools have been developed to thwart targeted advertisement. Adblock Plus is one such popular tool, used by millions of users on a daily basis, to block unwanted ads and trackers. Adblock Plus uses EasyList and EasyPrivacy, the most prominent and widely used open-source filters, to block unwanted web contents. However, Adblock Plus, by default, also enables an exception list to unblock web requests that comply with specific guidelines defined by the Acceptable Ads Committee. Any publisher can enroll into the Acceptable Ads initiative to request the unblocking of web contents. Adblock Plus in return charges a licensing fee from large entities, who gain a significant amount of ad impressions per month due to participation in the Acceptable Ads initiative. However, the privacy implications of the default inclusion of the exception list has not been well studied, especially as it can unblock not only ads, but also trackers (e.g., unblocking contents otherwise blocked by EasyPrivacy). In this paper, we take a data-driven approach, where we collect historical updates made to Adblock Plus's exception list and real-world web traffic by visiting the top 10k websites listed by Tranco. Using such data we analyze not only how the exception list has evolved over the years in terms of both contents unblocked and partners/entities enrolled into the Acceptable Ads initiative, but also the privacy implications of enabling the exception list by default. We found that Google not only unblocks the most number of unique domains, but is also unblocked by the most number of unique partners. From our traffic analysis, we see that of the 42,210 Google bound web requests, originally blocked by EasyPrivacy, around 80% of such requests are unblocked by the exception list. More worryingly, many of the requests enable 1-by-1 tracking pixel images. We, therefore, question exception rules that negate EasyPrivacy filtering rules by default and advocate for a better vetting process. 
    more » « less
  5. Abstract

    Small‐to‐medium businesses are always seeking affordable ways to advertise their products and services securely. With the emergence of mobile technology, it is possible than ever to implement innovative Location‐Based Advertising (LBS) systems using smartphones that preserve the privacy of mobile users. In this paper, we present a prototype implementation of such systems by developing a distributed privacy‐preserving system, which has parts executing on smartphones as a mobile app, as well as a web‐based application hosted on the cloud. The mobile app leverages Google Maps libraries to enhance the user experience in using the app. Mobile users can use the app to commute to their daily destinations while viewing relevant ads such as job openings in their neighborhood, discounts on favorite meals, etc. We developed a client‐server privacy architecture that anonymizes the mobile user trajectories using a bounded perturbation strategy. A multi‐modal sensing approach is proposed for modeling the context switching of the developed LBS system, which we represent as a Finite State Machine model. The multi‐modal sensing approach can reduce the power consumed by mobile devices by automatically detecting sensing mode changes to avoid unnecessary sensing. The developed LBS system is organized into two parts: the business side and the user side. First, the business side allows business owners to create new ads by providing the ad details, Geo‐location, photos, and any other instructions. Second, the user side allows mobile users to navigate through the map to see ads while walking, driving, bicycling, or quietly sitting in their offices. Experimental results are presented to demonstrate the scalability and performance of the mobile side. Our experimental evaluation demonstrates that the mobile app incurs low processing overhead and consequently has a small energy footprint.

     
    more » « less