Title: AutoFR: Automated Filter Rule Generation for Adblocking
Adblocking relies on filter lists, which are manually curated and maintained by a community of filter list authors. Filter list curation is a laborious process that does not scale well to a large number of sites or over time. In this paper, we introduce AutoFR, a reinforcement learning framework to fully automate the process of filter rule creation and evaluation for sites of interest. We design an algorithm based on multi-arm bandits to generate filter rules that block ads while controlling the trade-off between blocking ads and avoiding visual breakage. We test AutoFR on thousands of sites and we show that it is efficient: it takes only a few minutes to generate filter rules for a site of interest. AutoFR is effective: it generates filter rules that can block 86% of the ads, as compared to 87% by EasyList, while achieving comparable visual breakage. Furthermore, AutoFR generates filter rules that generalize well to new sites. We envision that AutoFR can assist the adblocking community in filter rule generation at scale.
Award ID(s):
1956393
NSF-PAR ID:
10431366
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of USENIX Security 2023
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art approaches detect and block outgoing A&T HTTP/S requests using manually curated filter lists (e.g., EasyList) and, more recently, machine learning. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules that can then be used to block traffic or label ground truth datasets. We propose NoMoATS, a system that removes this bottleneck by reducing the daunting task of manually creating filter rules to the much easier and more scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts, previously undetected by popular filter lists. To the best of our knowledge, our system is the first to (1) automatically label which mobile network requests are engaged in A&T, while requiring only that libraries be manually labeled by purpose, and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads, but also tracking.
  2. The adblocking arms race has escalated over the last few years. An entire new ecosystem of circumvention (CV) services has recently emerged that aims to bypass adblockers by obfuscating site content, making it difficult for adblocking filter lists to distinguish between ads and functional content. In this paper, we investigate recent anti-circumvention efforts by the adblocking community that leverage custom filter lists. In particular, we analyze the anti-circumvention filter list (ACVL), which supports advanced filter rules with enriched syntax and capabilities designed specifically to counter circumvention. We show that keeping ACVL rules up-to-date requires expert list curators to continuously monitor sites known to employ CV services and to discover new such sites in the wild; both tasks require considerable manual effort. To help automate and scale ACVL curation, we develop CV-INSPECTOR, a machine learning approach for automatically detecting adblock circumvention using differential execution analysis. We show that CV-INSPECTOR achieves 93% accuracy in detecting sites that successfully circumvent adblockers. We deploy CV-INSPECTOR on top-20K sites to discover the sites that employ circumvention in the wild. We further apply CV-INSPECTOR to a list of sites that are known to utilize circumvention and are closely monitored by ACVL authors. We demonstrate that CV-INSPECTOR reduces the human labeling effort by 98%, which removes a major bottleneck for ACVL authors. Our work is the first large-scale study of the state of the adblock circumvention arms race, and makes an important step towards automating anti-CV efforts.
  3. Targeted advertisement is prevalent on the Web. Many privacy-enhancing tools have been developed to thwart targeted advertisement. Adblock Plus is one such popular tool, used by millions of users on a daily basis, to block unwanted ads and trackers. Adblock Plus uses EasyList and EasyPrivacy, the most prominent and widely used open-source filters, to block unwanted web contents. However, Adblock Plus, by default, also enables an exception list to unblock web requests that comply with specific guidelines defined by the Acceptable Ads Committee. Any publisher can enroll in the Acceptable Ads initiative to request the unblocking of web contents. Adblock Plus in return charges a licensing fee from large entities, who gain a significant amount of ad impressions per month due to participation in the Acceptable Ads initiative. However, the privacy implications of the default inclusion of the exception list have not been well studied, especially as it can unblock not only ads, but also trackers (e.g., unblocking contents otherwise blocked by EasyPrivacy). In this paper, we take a data-driven approach, where we collect historical updates made to Adblock Plus's exception list and real-world web traffic by visiting the top 10k websites listed by Tranco. Using such data we analyze not only how the exception list has evolved over the years in terms of both contents unblocked and partners/entities enrolled in the Acceptable Ads initiative, but also the privacy implications of enabling the exception list by default. We found that Google not only unblocks the largest number of unique domains, but is also unblocked by the largest number of unique partners. From our traffic analysis, we see that of the 42,210 Google-bound web requests originally blocked by EasyPrivacy, around 80% are unblocked by the exception list. More worryingly, many of the requests enable 1-by-1 tracking pixel images. We, therefore, question exception rules that negate EasyPrivacy filtering rules by default and advocate for a better vetting process.
  4. To evade their predators, animals must quickly detect potential threats, gauge risk, and mount a response. Putative neural circuits responsible for these tasks have been isolated in laboratory studies. However, it is unclear whether and how these circuits combine to generate the flexible, dynamic sequences of evasion behavior exhibited by wild, freely moving animals. Here, we report that evasion behavior of wild fish on a coral reef is generated through a sequence of well-defined decision rules that convert visual sensory input into behavioral actions. Using an automated system to present visual threat stimuli to fish in situ, we show that individuals initiate escape maneuvers in response to the perceived size and expansion rate of an oncoming threat using a decision rule that matches dynamics of known loom-sensitive neural circuits. After initiating an evasion maneuver, fish adjust their trajectories using a control rule based on visual feedback to steer away from the threat and toward shelter. These decision rules accurately describe evasion behavior of fish from phylogenetically distant families, illustrating the conserved nature of escape decision-making. Our results reveal how the flexible behavioral responses required for survival can emerge from relatively simple, conserved decision-making mechanisms.

     
  5. Emerging theory suggests that the ecosystem‐level consequences of anthropogenic pressures depend on how species will be disassembled from ecological communities (i.e. the disassembly rule). Species loss, however, is not the sole ecological cause of ecosystem function loss: behaviours underpinning ecosystem function can also be disrupted by anthropogenic pressures without detectable declines of component species (‘cryptic function loss’).

    Here, we introduce a novel framework that integrates behavioural responses into community disassembly metrics. We applied this framework to freshwater mussel communities (order Unionida) of the midwestern United States, in which intensive agricultural land use threatens stream biota. We combined a field experiment, meta‐analysis and watershed‐scale population dataset to assess how excessive sediment concentrations, one of the leading drivers of freshwater biodiversity loss, influence community‐level water clearance rates of freshwater mussels via behavioural (changes in mass‐specific clearance rate) and population (changes in population density) responses.

    Our study provided three key insights. First, freshwater mussels exhibited high behavioural sensitivity to increased total suspended solids (TSS) across species (i.e. reduced water clearance rate), whereas population responses were highly species‐specific. Second, the behavioural response to increased TSS causes substantial cryptic function loss under stressful conditions: simulated water clearance rates when behavioural response is included can be less than half that of mussel communities with no behavioural response. Finally, simulations revealed that mussel communities are likely to show rapid but consistent rates of ecosystem function loss irrespective of disassembly rules. The similar rates of function loss are due to the uniform behavioural response to TSS that masks the linkage between population sensitivity of a species and its contribution to ecosystem function.

    Synthesis and applications. Our findings suggest that ignoring behavioural processes may cause non‐negligible underestimation of ecosystem function loss during community disassembly, potentially leading to overly optimistic assessments of ecosystem resilience. Furthermore, unlike species declines or local extinctions, behavioural responses tied to function loss may occur concurrently with increasing anthropogenic pressures. Therefore, managers should acknowledge the risk of immediate function loss after human‐induced environmental changes.

     