This content will become publicly available on July 1, 2024

Title: BFTDETECTOR: Automatic Detection of Business Flow Tampering for Digital Content Service
Digital content services provide users with a wide range of content, such as news, articles, or movies, while monetizing their content through various business models and promotional methods. Unfortunately, poorly designed or unprotected business logic can be circumvented by malicious users, which is known as business flow tampering. Such flaws can severely harm the businesses of digital content service providers. In this paper, we propose an automated approach that discovers business flow tampering flaws. Our technique automatically runs a web service to cover different business flows (e.g., a news website with vs. without a subscription paywall) and collects execution traces. We perform differential analysis on the execution traces to identify divergence points, where the business flows begin to differ, and then test whether those divergence points can be tampered with. We assess our approach against 352 real-world digital content service providers and discover 315 flaws from 204 websites, including TIME, Fortune, and Forbes. Our evaluation shows that our technique successfully identifies these flaws with low false-positive and false-negative rates of 0.49% and 1.44%, respectively.
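The differential-analysis step described in the abstract can be illustrated with a minimal sketch. It assumes each execution trace is simply an ordered list of event names and reports the first position where two traces differ. The function name `first_divergence` and the sample event names are hypothetical, chosen only for illustration; this is not the authors' implementation, which operates on real web-service execution traces.

```python
from typing import Optional, Sequence, Tuple

Divergence = Tuple[int, Optional[str], Optional[str]]


def first_divergence(trace_a: Sequence[str],
                     trace_b: Sequence[str]) -> Optional[Divergence]:
    """Return (index, event_a, event_b) at the first position where the
    two traces differ, or None if the traces are identical."""
    # Walk the shared prefix and stop at the first mismatching event.
    for i, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return i, a, b
    # One trace is a strict prefix of the other: they diverge where the
    # shorter one ends, and the missing side is reported as None.
    if len(trace_a) != len(trace_b):
        i = min(len(trace_a), len(trace_b))
        a = trace_a[i] if i < len(trace_a) else None
        b = trace_b[i] if i < len(trace_b) else None
        return i, a, b
    return None


# Hypothetical traces for the same page with and without a subscription.
paywalled = ["loadArticle", "checkSubscription", "showPaywall"]
subscribed = ["loadArticle", "checkSubscription", "renderFullText"]

divergence = first_divergence(paywalled, subscribed)
```

In this toy setting the event just before the reported index (`checkSubscription`) would be the candidate to test for tampering, since it is the last step the two flows share.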
Award ID(s):
2145616 1908021 1916499
NSF-PAR ID:
10428685
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Proceedings of the International Conference on Software Engineering
ISSN:
1819-3781
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. To improve processor performance, computer architects have adopted such acceleration techniques as speculative execution and caching. However, researchers have recently discovered that this approach implies inherent security flaws, as exploited by Meltdown and Spectre. Attacks targeting these vulnerabilities can leak protected data through side channels such as data cache timing by exploiting mis-speculated executions. The flaws can be catastrophic because they are fundamental and widespread, and they affect many modern processors. Mitigating the effect of Meltdown is relatively straightforward in that it entails a software-based fix which has already been deployed by major OS vendors. However, to this day, there is no effective mitigation for Spectre. Fixing the problem may require a redesign of the architecture for conditional execution in future processors. In addition, a Spectre attack is hard to detect using traditional software-based antivirus techniques because it does not leave traces in traditional log files. In this paper, we propose to monitor microarchitectural events, such as cache misses and branch mispredictions, from existing CPU performance counters to detect Spectre during attack runtime. Our detector was able to achieve 0% false negatives with less than 1% false positives using various machine learning classifiers with a reasonable performance overhead.
  2. Interactive web-based applications play an important role for both service providers and consumers. However, web applications tend to be complex, produce high-volume data, and are often ripe for attack. Attack analysis and remediation are complicated by adversary obfuscation and the difficulty in assembling and analyzing logs. In this work, we explore the web application analysis task through log file fusion, distillation, and visualization. Our approach consists of visualizing the logs of web and database traffic with detailed function execution traces. We establish causal links between events and their associated behaviors. We evaluate the effectiveness of this process using data volume reduction statistics, user interaction models, and usage scenarios. Across a set of scenarios, we find that our techniques can filter at least 97.5% of log data and reduce analysis time by 93-96%. 
  3. We report the first wide-scale measurement study of server-side geographic restriction, or geoblocking, a phenomenon in which server operators intentionally deny access to users from particular countries or regions. Many sites practice geoblocking due to legal requirements or other business reasons, but excessive blocking can needlessly deny valuable content and services to entire national populations. To help researchers and policymakers understand this phenomenon, we develop a semi-automated system to detect instances where whole websites were rendered inaccessible due to geoblocking. By focusing on detecting geoblocking capabilities offered by large CDNs and cloud providers, we can reliably distinguish the practice from dynamic anti-abuse mechanisms and network-based censorship. We apply our techniques to test for geoblocking across the Alexa Top 10K sites from thousands of vantage points in 177 countries. We then expand our measurement to a sample of CDN customers in the Alexa Top 1M. We find that geoblocking occurs across a broad set of countries and sites. We observe geoblocking in nearly all countries we study, with Iran, Syria, Sudan, Cuba, and Russia experiencing the highest rates. These countries experience particularly high rates of geoblocking for finance and banking sites, likely as a result of US economic sanctions. We also verify our measurements with data provided by Cloudflare, and find our observations to be accurate. 
  4. Most online mobile services make use of location data to improve customer experience. Mobile users can locate points of interest near them, or can receive recommendations tailored to their whereabouts. However, serious privacy concerns arise when location data is revealed in the clear to service providers. Several solutions employ searchable encryption (SE) to evaluate spatial predicates directly on location ciphertexts. While doing so preserves privacy, the performance overhead incurred is high. We focus on a prominent SE technique in the public-key setting, Hidden Vector Encryption (HVE), and propose a graph embedding technique to encode location data in a way that significantly boosts the performance of processing on ciphertexts. We show that finding the optimal encoding is NP-hard, and provide several heuristics that are fast and obtain significant performance gains. Our extensive experimental evaluation on real-life datasets shows that our solutions can improve computational overhead by a factor of two compared to the baseline.