
Title: NoMoAds: Effective and Efficient Cross-App Mobile Ad-Blocking (The Andreas Pfitzmann Best Student Paper Award)
Although advertising is a popular strategy for mobile app monetization, it is often desirable to block ads in order to improve usability, performance, privacy, and security. In this paper, we propose NoMoAds to block ads served by any app on a mobile device. NoMoAds leverages the network interface as a universal vantage point: it can intercept, inspect, and block outgoing packets from all apps on a mobile device. NoMoAds extracts features from packet headers and/or payload to train machine learning classifiers for detecting ad requests. To evaluate NoMoAds, we collect and label a new dataset using both EasyList and manually created rules. We show that NoMoAds is effective: it achieves an F-score of up to 97.8% and performs well when deployed in the wild. Furthermore, NoMoAds is able to detect mobile ads that are missed by EasyList (more than one-third of ads in our dataset). We also show that NoMoAds is efficient: it performs ad classification on a per-packet basis in real-time. To the best of our knowledge, NoMoAds is the first mobile ad-blocker to effectively and efficiently block ads served across all apps using a machine learning approach.
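The pipeline described above — extract features from each outgoing packet, then classify it as an ad request or not — can be illustrated with a minimal sketch. This is not the paper's actual model: the feature names, token lists, and decision splits below are hypothetical stand-ins for the kind of rules a trained classifier would learn from labeled packets.

```python
import re

# Illustrative sketch of per-packet ad classification. In a NoMoAds-style
# system the decision splits are learned by a machine learning classifier
# from labeled packets; here they are written by hand for clarity.

def extract_features(request_text):
    """Derive simple text features from an outgoing HTTP request."""
    tokens = set(re.split(r"[^a-z0-9_]+", request_text.lower())) - {""}
    return {
        # Hypothetical ad-related tokens, not an actual filter list.
        "has_ad_token": bool(tokens & {"ads", "adserver", "banner", "doubleclick"}),
        "has_tracker_param": bool(tokens & {"idfa", "advertising_id"}),
        "path_depth": request_text.count("/"),
    }

def classify(features):
    """A tiny hand-built decision tree over the extracted features."""
    if features["has_ad_token"]:
        return "ad"
    if features["has_tracker_param"] and features["path_depth"] > 3:
        return "ad"
    return "benign"
```

For example, `classify(extract_features("GET http://ads.example.com/banner?w=320"))` would return `"ad"`, while a request without ad-related tokens or tracker parameters would be classified `"benign"`. A real deployment would run this per packet at the network interface, which is why keeping feature extraction cheap matters for efficiency.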
Journal Name: Proceedings of the Privacy Enhancing Technologies Symposium (PETS)
Sponsoring Org: National Science Foundation
More Like this
  1. Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art approaches detect and block outgoing A&T HTTP/S requests by using manually curated filter lists (e.g., EasyList) and, more recently, machine learning approaches. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules that can then be used to block traffic or label ground truth datasets. We propose NoMoATS – a system that removes this bottleneck by reducing the daunting task of manually creating filter rules to the much easier and scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real-time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts, previously undetected by popular filter lists. To the best of our knowledge, our system is the first to (1) automatically label which mobile network requests are engaged in A&T, while requiring only manual labeling of libraries according to their purpose, and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads but also tracking.
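The core of the stack-trace labeling idea can be sketched briefly: a network request is labeled A&T if any frame in the stack trace that produced it belongs to a package manually mapped to an advertising or tracking library. This is an illustrative simplification of the approach the abstract describes; the package prefixes below are examples, not the paper's actual library list.

```python
# Hypothetical sketch of NoMoATS-style automatic labeling. The one-time
# manual effort is mapping library package prefixes to a purpose; every
# request is then labeled automatically from its stack trace.

AT_LIBRARY_PREFIXES = {
    "com.google.ads",   # example: advertising
    "com.mopub",        # example: advertising
    "com.crashlytics",  # example: tracking/analytics
}

def label_request(stack_trace_frames):
    """stack_trace_frames: fully qualified method names, best-first,
    e.g. 'com.mopub.network.Networking.load'."""
    for frame in stack_trace_frames:
        if any(frame.startswith(prefix) for prefix in AT_LIBRARY_PREFIXES):
            return "A&T"
    return "benign"
```

A request whose stack trace passes through `com.mopub.*` would be labeled `"A&T"`, while one originating purely from the app's own packages would be labeled `"benign"`. Labels produced this way can then serve as ground truth for training the on-device classifiers.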
  2. Machine learning-based malware detection systems are often vulnerable to evasion attacks, in which a malware developer manipulates their malicious software such that it is misclassified as benign. Such software hides some properties of the real class or adopts some properties of a different class by applying small perturbations. A special case of evasive malware hides by repackaging a bona fide benign mobile app to contain malware in addition to the original functionality of the app, thus retaining most of the benign properties of the original app. We present a novel malware detection system based on metamorphic testing principles that can detect such benign-seeming malware apps. We apply metamorphic testing to the feature representation of the mobile app, rather than to the app itself. That is, the source input is the original feature vector for the app and the derived input is that vector with selected features removed. If the app was originally classified benign, and is indeed benign, the output for the source and derived inputs should be the same class, i.e., benign, but if they differ, then the app is exposed as (likely) malware. Malware apps originally classified as malware should retain that classification, since only features prevalent in benign apps are removed. This approach enables the machine learning model to classify repackaged malware with reasonably few false negatives and false positives. Our training pipeline is simpler than many existing ML-based malware detection methods, as the network is trained end-to-end to jointly learn appropriate features and to perform classification. We pre-trained our classifier model on 3 million apps collected from the widely-used AndroZoo dataset. We perform an extensive study on other publicly available datasets to show our approach’s effectiveness in detecting repackaged malware with more than 94% accuracy, 0.98 precision, 0.95 recall, and 0.96 F1 score.
  3. Abstract STUDY QUESTION To what extent does the use of mobile computing apps to track the menstrual cycle and the fertile window influence fecundability among women trying to conceive? SUMMARY ANSWER After adjusting for potential confounders, use of any of several different apps was associated with increased fecundability ranging from 12% to 20% per cycle of attempt. WHAT IS KNOWN ALREADY Many women are using mobile computing apps to track their menstrual cycle and the fertile window, including while trying to conceive. STUDY DESIGN, SIZE, DURATION The Pregnancy Study Online (PRESTO) is a North American prospective internet-based cohort of women who are aged 21–45 years, trying to conceive and not using contraception or fertility treatment at baseline. PARTICIPANTS/MATERIALS, SETTING, METHODS We restricted the analysis to 8363 women trying to conceive for no more than 6 months at baseline; the women were recruited from June 2013 through May 2019. Women completed questionnaires at baseline and every 2 months for up to 1 year. The main outcome was fecundability, i.e. the per-cycle probability of conception, which we assessed using self-reported data on time to pregnancy (confirmed by positive home pregnancy test) in menstrual cycles. On the baseline and follow-up questionnaires, women reported whether they used mobile computing apps to track their menstrual cycles (‘cycle apps’) and, if so, which one(s). We estimated fecundability ratios (FRs) for the use of cycle apps, adjusted for female age, race/ethnicity, prior pregnancy, BMI, income, current smoking, education, partner education, caffeine intake, use of hormonal contraceptives as the last method of contraception, hours of sleep per night, cycle regularity, use of prenatal supplements, marital status, intercourse frequency and history of subfertility. We also examined the impact of concurrent use of fertility indicators: basal body temperature, cervical fluid, cervix position and/or urine LH.
MAIN RESULTS AND THE ROLE OF CHANCE Among 8363 women, 6077 (72.7%) were using one or more cycle apps at baseline. A total of 122 separate apps were reported by women. We designated five of these apps before analysis as more likely to be effective (Clue, Fertility Friend, Glow, Kindara, Ovia; hereafter referred to as ‘selected apps’). The use of any app at baseline was associated with 20% increased fecundability, with little difference between selected apps versus other apps (selected apps FR (95% CI): 1.20 (1.13, 1.28); all other apps 1.21 (1.13, 1.30)). In time-varying analyses, cycle app use was associated with 12–15% increased fecundability (selected apps FR (95% CI): 1.12 (1.04, 1.21); all other apps 1.15 (1.07, 1.24)). When apps were used at baseline with one or more fertility indicators, there was higher fecundability than without fertility indicators (selected apps with indicators FR (95% CI): 1.23 (1.14, 1.34) versus without indicators 1.17 (1.05, 1.30); other apps with indicators 1.30 (1.19, 1.43) versus without indicators 1.16 (1.06, 1.27)). In time-varying analyses, results were similar when stratified by time trying at study entry (<3 vs. 3–6 cycles) or cycle regularity. For use of the selected apps, we observed higher fecundability among women with a history of subfertility: FR 1.33 (1.05–1.67). LIMITATIONS, REASONS FOR CAUTION Neither regularity nor intensity of app use was ascertained. The prospective time-varying assessment of app use was based on questionnaires completed every 2 months, which would not capture more frequent changes. Intercourse frequency was also reported retrospectively and we do not have data on timing of intercourse relative to the fertile window. Although we controlled for a wide range of covariates, we cannot exclude the possibility of residual confounding (e.g. choosing to use an app in this observational study may be a marker for unmeasured health habits promoting fecundability). 
Half of the women in the study received a free premium subscription for one of the apps (Fertility Friend), which may have increased the overall prevalence of app use in the time-varying analyses, but would not affect app use at baseline. Most women in the study were college educated, which may limit application of results to other populations. WIDER IMPLICATIONS OF THE FINDINGS Use of a cycle app, especially in combination with observation of one or more fertility indicators (basal body temperature, cervical fluid, cervix position and/or urine LH), may increase fecundability (per-cycle pregnancy probability) by about 12–20% for couples trying to conceive. We did not find consistent evidence of improved fecundability resulting from use of one specific app over another. STUDY FUNDING/COMPETING INTEREST(S) This research was supported by grants, R21HD072326 and R01HD086742, from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, USA. In the last 3 years, Dr L.A.W. has served as a fibroid consultant. Dr L.A.W. has also received in-kind donations from Sandstone Diagnostics and Swiss Precision Diagnostics for primary data collection and participant incentives in the PRESTO cohort. Dr J.B.S. reports personal fees from Swiss Precision Diagnostics, outside the submitted work. The remaining authors have nothing to declare. TRIAL REGISTRATION NUMBER N/A.
  4. It is commonly assumed that the availability of “free” mobile apps comes at the cost of consumer privacy, and that paying for apps could offer consumers protection from behavioral advertising and long-term tracking. This work empirically evaluates the validity of this assumption by investigating the degree to which “free” apps and their paid premium versions differ in their bundled code, their declared permissions, and their data collection behaviors and privacy practices. We compare pairs of free and paid apps using a combination of static and dynamic analysis. We also examine the differences in the privacy policies within pairs. We rely on static analysis to determine the requested permissions and third-party SDKs in each app; we use dynamic analysis to detect sensitive data collected by remote services at the network traffic level; and we compare text versions of privacy policies to identify differences in the disclosure of data collection behaviors. In total, we analyzed 1,505 pairs of free Android apps and their paid counterparts, with free apps randomly drawn from the Google Play Store’s category-level top charts. Our results show that over our corpus of free and paid pairs, there is no clear evidence that paying for an app will guarantee protection from extensive data collection. Specifically, 48% of the paid versions reused all of the same third-party libraries as their free versions, while 56% of the paid versions inherited all of the free versions’ Android permissions to access sensitive device resources (when considering free apps that include at least one third-party library and request at least one Android permission). Additionally, our dynamic analysis reveals that 38% of the paid apps exhibit all of the same data collection and transmission behaviors as their free counterparts.
Our exploration of privacy policies reveals that only 45% of the pairs provide a privacy policy of some sort, and less than 1% of the pairs overall have policies that differ between free and paid versions.
  5. Today there is no effective support for device-wide question answering on mobile devices. State-of-the-art QA models are deep learning behemoths designed for the cloud which run extremely slowly and require more memory than available on phones. We present DeQA, a suite of latency and memory optimizations that adapts existing QA systems to run completely locally on mobile phones. Specifically, we design two latency optimizations that (1) stop processing documents if further processing cannot improve answer quality, and (2) identify computation that does not depend on the question and move it offline. These optimizations do not depend on the QA model internals and can be applied to several existing QA models. DeQA also implements a set of memory optimizations by (i) loading partial indexes in memory, (ii) working with smaller units of data, and (iii) replacing in-memory lookups with a key-value database. We use DeQA to port three state-of-the-art QA systems to the mobile device and evaluate over three datasets. The first is the large-scale SQuAD dataset defined over a Wikipedia collection. We also create two on-device QA datasets, one over a publicly available email data collection and the other using a cross-app data collection we obtain from two users. Our evaluations show that DeQA can run QA models with only a few hundred MBs of memory and provides at least 13x speedup on average on the mobile phone across all three datasets.
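The first latency optimization — stop processing documents once further processing cannot improve answer quality — can be sketched as an early-stopping loop over retrieved documents. This is an illustrative simplification under assumed interfaces: the `score_answer` function, the confidence scale, and the `margin` threshold are hypothetical, not DeQA's actual scoring.

```python
# Hypothetical sketch of early stopping in open-domain QA: documents
# arrive sorted by retrieval rank (best first), and we stop reading as
# soon as the best answer so far is unlikely to be beaten.

def answer_with_early_stop(documents, score_answer, margin=0.2):
    """documents: iterable sorted by retrieval score, best first.
    score_answer(doc) -> (answer, confidence in [0, 1])."""
    best_answer, best_score = None, 0.0
    for rank, doc in enumerate(documents):
        answer, score = score_answer(doc)
        if score > best_score:
            best_answer, best_score = answer, score
        # Stop once remaining documents cannot plausibly improve the
        # answer by more than `margin` (an assumed cutoff heuristic).
        if best_score >= 1.0 - margin:
            return best_answer, rank + 1  # answer, documents processed
    return best_answer, len(documents)
```

With `margin=0.2`, a document scoring 0.9 at rank 2 halts the loop immediately, so later documents are never run through the (expensive) reading model. The question-independent computation in optimization (2) would be the part of `score_answer` that only touches `doc`, precomputed offline.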