Title: A Human in Every APE: Delineating and Evaluating the Human Analysis Systems of Anti-Phishing Entities
We conducted a large-scale evaluation of popular Anti-Phishing Entities (APEs). As part of this, we submitted arrays of CAPTCHA-challenge-laden honey sites to 7 APEs. An analysis of the “click-through rates” during visits from the APEs showed strong evidence for the presence of formidable human analysis systems operating in conjunction with automated crawler systems. In summary, we estimate that as many as 10% to 24% of URLs submitted to each of 4 APEs (Google Safe Browsing, Microsoft SmartScreen, Bitdefender, and Netcraft) were likely visited by human analysts. In contrast to prior work, these measurements present a very optimistic picture for web security as, for the first time, they show the presence of expansive human analysis systems for tackling suspicious URLs that might otherwise be challenging for automated crawlers to analyze. This finding gave us the opportunity to conduct the first systematic study of the robustness of the human analysis systems of APEs, which revealed some glaring weaknesses. We saw that all the APEs we studied fall prey to issues such as a lack of geolocation and client-device diversity, exposing their human systems to targeted evasive attacks. Beyond this, we also found a weakness across the entire APE ecosystem that enables the creation of long-lasting phishing pages targeted exclusively at Android/Chrome devices by capitalizing on discrepancies in web sensor API outputs. We demonstrate this with the help of 10 artificial phishing sites that survived indefinitely despite repeated reporting to all APEs. We suggest mitigations for all these issues and conducted an elaborate disclosure process with all affected APEs in an attempt to persuade them to pursue these mitigations.
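The core measurement above reduces to comparing, per APE, how many submitted honey URLs received a visit that interactively passed the CAPTCHA gate. Below is a minimal sketch of that click-through-rate estimate, assuming a hypothetical log format of (APE, URL, clicked-through) records; it illustrates the idea, not the authors' actual pipeline.

```python
# Minimal sketch of the click-through-rate estimate, assuming a
# hypothetical log format of (ape, url, clicked_through) records in
# which clicked_through marks an interactive pass of the CAPTCHA gate.
from collections import defaultdict

def clickthrough_rates(visits):
    submitted = defaultdict(set)  # honey URLs each APE visited
    clicked = defaultdict(set)    # URLs whose CAPTCHA gate was passed
    for ape, url, clicked_through in visits:
        submitted[ape].add(url)
        if clicked_through:
            clicked[ape].add(url)
    # A passed CAPTCHA is treated as evidence of a human analyst, since
    # automated crawlers generally cannot solve the challenge.
    return {ape: len(clicked[ape]) / len(submitted[ape]) for ape in submitted}

logs = [
    ("APE-A", "https://honey.example/1", True),
    ("APE-A", "https://honey.example/2", False),
    ("APE-A", "https://honey.example/3", False),
    ("APE-A", "https://honey.example/4", False),
]
print(clickthrough_rates(logs))  # {'APE-A': 0.25}
```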
Award ID(s):
2126655
PAR ID:
10358903
Author(s) / Creator(s):
Editor(s):
Cavallaro, L.; Gruss, D.; Pellegrino, G.; Giacinto, G.
Date Published:
Journal Name:
International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA)
Page Range / eLocation ID:
156-177
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Phishing websites often imitate benign websites, with the objective of luring unsuspecting users to visit them. These visits may be driven by links in phishing emails, on web pages, or in web search results. Although the precise motivations behind phishing websites may differ, the common denominator is that unsuspecting users are required to take some action, e.g., clicking on a desired Uniform Resource Locator (URL). To accurately identify phishing websites, the cybersecurity community has relied on a variety of approaches, including blacklisting, heuristic techniques, and content-based approaches, among others. These identification techniques are often enhanced using an array of methods, e.g., honeypots, feature recognition, manual reporting, and web crawlers. Nevertheless, a number of phishing websites still escape detection, either because they are not blacklisted, are too recent, or were incorrectly evaluated. It is therefore imperative to enhance solutions that could mitigate phishing website threats. In this study, the effectiveness of the Bidirectional Encoder Representations from Transformers (BERT) model is investigated as a possible tool for detecting phishing URLs. The experimental results show that the BERT transformer model achieves acceptable prediction results without requiring advanced URL feature selection techniques or the involvement of a domain specialist.
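A minimal sketch of the kind of pipeline this study describes follows, assuming a BERT checkpoint already fine-tuned on labeled phishing/benign URLs; the model id "my-org/bert-phishing-url" is a hypothetical stand-in.

```python
# Minimal sketch of BERT-based URL classification, assuming a checkpoint
# already fine-tuned on labeled phishing/benign URLs. The model id
# "my-org/bert-phishing-url" is hypothetical; substitute a real one.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("my-org/bert-phishing-url")
model = AutoModelForSequenceClassification.from_pretrained("my-org/bert-phishing-url")

def phishing_score(url: str) -> float:
    # The raw URL string is tokenized directly; no hand-crafted feature
    # selection or domain specialist is needed.
    inputs = tokenizer(url, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()  # P(phishing)

print(phishing_score("http://secure-login.example-bank.com.evil.tld/verify"))
```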
  2. Phishing is a ubiquitous and increasingly sophisticated online threat. To evade mitigations, phishers try to "cloak" malicious content from defenders to delay their appearance on blacklists, while still presenting the phishing payload to victims. This cat-and-mouse game is variable and fast-moving, with many distinct cloaking methods---we construct a dataset identifying 2,933 real-world phishing kits that implement cloaking mechanisms. These kits use information from the host, browser, and HTTP request to classify traffic as either an anti-phishing entity or a potential victim and change their behavior accordingly. In this work we present SPARTACUS, a technique that subverts the phishing status quo by disguising user traffic as anti-phishing entities. These intentional false positives trigger cloaking behavior in phishing kits, thus hiding the malicious payload and protecting the user without disrupting benign sites. To evaluate the effectiveness of this approach, we deployed SPARTACUS as a browser extension from November 2020 to July 2021. During that time, SPARTACUS browsers visited 160,728 reported phishing URLs in the wild. Of these, SPARTACUS protected against 132,274 sites (82.3%). The phishing kits which showed malicious content to SPARTACUS typically did so due to ineffective cloaking---the majority (98.4%) of the remainder were detected by conventional anti-phishing systems such as Google Safe Browsing or VirusTotal, and would be blacklisted regardless. We further evaluate SPARTACUS against benign websites sampled from the Alexa Top One Million List for impacts on latency, accessibility, layout, and CPU overhead, finding minimal performance penalties and no loss in functionality.
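The core idea can be sketched as below: present request attributes that phishing kits fingerprint as belonging to an anti-phishing entity, so their cloaking logic hides the payload. The header value is an illustrative assumption; the real system is a browser extension carrying a much richer fingerprint.

```python
# Minimal sketch of the SPARTACUS idea: present request attributes that
# phishing kits associate with anti-phishing entities, triggering their
# cloaking logic. The User-Agent below is an illustrative assumption.
import requests

SCANNER_HEADERS = {
    # Many kits blocklist known security crawlers by User-Agent, among
    # other host/browser/HTTP request signals.
    "User-Agent": ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                   "+http://www.google.com/bot.html)"),
}

def fetch_as_scanner(url: str) -> requests.Response:
    # A kit that cloaks against scanners should now serve benign content
    # (or an error page) instead of the phishing payload.
    return requests.get(url, headers=SCANNER_HEADERS, timeout=10)

resp = fetch_as_scanner("http://phish.example/login")  # hypothetical URL
print(resp.status_code, len(resp.text))
```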
  3. Phishing is a serious challenge that remains largely unsolved despite the efforts of many researchers. In this paper, we present datasets and tools to help phishing researchers. First, we describe our efforts to create high-quality, diverse, and representative email and URL/website datasets for phishing, and to make them publicly available. Second, we describe PhishBench, a benchmarking framework that automates the extraction of more than 200 features and implements more than 30 classifiers and 12 evaluation metrics for the detection of phishing emails, websites, and URLs. Using PhishBench, the research community can easily run their models and benchmark their work against the work of others who have used common dataset sources for emails (Nazario, SpamAssassin, WikiLeaks, etc.) and URLs (PhishTank, APWG, Alexa, etc.).
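A minimal sketch of this benchmarking pattern follows: extract features from URLs, then score several classifiers with common metrics. The three features, two models, and toy data are an illustrative subset of the 200+ features and 30+ classifiers the framework provides, not PhishBench's actual API.

```python
# Minimal sketch of the PhishBench-style benchmarking pattern; the
# features, models, and toy data are an illustrative subset only.
from urllib.parse import urlparse
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

def extract_features(url):
    parsed = urlparse(url)
    return [len(url),                  # overall URL length
            parsed.netloc.count("."),  # subdomain depth
            float("@" in url)]         # '@' redirection trick indicator

train_urls = ["https://www.example.com/", "https://news.example.org/a",
              "http://paypal.com.verify-me.evil.tld/login",
              "http://login@203.0.113.9/bank"]
train_labels = [0, 0, 1, 1]  # 1 = phishing, 0 = benign
test_urls = ["https://docs.example.com/x",
             "http://appleid.apple.com.confirm.evil.tld/"]
test_labels = [0, 1]

X_train = [extract_features(u) for u in train_urls]
X_test = [extract_features(u) for u in test_urls]

for clf in (RandomForestClassifier(), LogisticRegression(max_iter=1000)):
    preds = clf.fit(X_train, train_labels).predict(X_test)
    print(type(clf).__name__,
          f"acc={accuracy_score(test_labels, preds):.2f}",
          f"f1={f1_score(test_labels, preds):.2f}")
```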
  4. Phishing websites remain a persistent security threat. Thus far, machine learning approaches appear to have the best potential as defenses, but there are two main concerns with existing machine learning approaches for phishing detection. The first is the large number of training features used and the lack of validating arguments for these feature choices. The second is the type of datasets used in the literature, which are inadvertently biased with respect to features based on the website URL or content. To address these concerns, we put forward the intuition that the domain name of a phishing website is the tell-tale sign of phishing and holds the key to successful phishing detection. Accordingly, we design features that model the relationships, visual as well as statistical, of the domain name to the key elements of a phishing website that are used to snare end users. The main value of our feature design is that, to bypass detection, an attacker will find it very difficult to tamper with the visual content of the phishing website without arousing the suspicion of the end user. Our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97% on a sample dataset. Compared to state-of-the-art work, our per-instance classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs, as they are likely to be biased towards specific datasets. We show the robustness of our learning algorithm by testing on unknown live phishing URLs, achieving a high detection accuracy of 99.7%.
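One domain-to-content relationship feature in the spirit of this paper can be sketched as below: string similarity between the domain's core label and brand terms shown in the page title. The similarity measure is an illustrative assumption, not one of the paper's exact seven features.

```python
# Minimal sketch of one domain-to-content relationship feature in the
# spirit of the paper; the similarity measure is an assumption.
from difflib import SequenceMatcher

def domain_title_similarity(domain: str, title: str) -> float:
    # A phishing page typically displays a trusted brand in its visible
    # content while being hosted on an unrelated domain, so a low score
    # between the two is a tell-tale sign.
    core = domain.lower().split(".")[0]
    words = [w.lower() for w in title.split()]
    return max((SequenceMatcher(None, core, w).ratio() for w in words),
               default=0.0)

print(domain_title_similarity("paypal.com", "PayPal - Log In"))       # high
print(domain_title_similarity("secure-pay0.tld", "PayPal - Log In"))  # lower
```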
  5. The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, research, and innovation. However, their effectiveness and accessibility also render them susceptible to abuse for generating malicious content, including phishing attacks. This study explores the potential of using four popular commercially available LLMs, i.e., ChatGPT (GPT-3.5 Turbo), GPT-4, Claude, and Bard, to generate functional phishing attacks using a series of malicious prompts. We discover that these LLMs can generate both phishing websites and emails that convincingly imitate well-known brands and also deploy a range of evasive tactics used to elude the detection mechanisms employed by anti-phishing systems. These attacks can be generated using unmodified or "vanilla" versions of these LLMs, without requiring any prior adversarial exploits such as jailbreaking. We evaluate the performance of the LLMs at generating these attacks and find that they can also be utilized to create malicious prompts that, in turn, can be fed back to the model to generate phishing scams, thus massively reducing the prompt-engineering effort required by attackers to scale these threats. As a countermeasure, we build a BERT-based automated detection tool for the early detection of malicious prompts, to prevent LLMs from generating phishing content. Our model is transferable across all four commercial LLMs, attaining an average accuracy of 96% for phishing website prompts and 94% for phishing email prompts. We also disclosed these vulnerabilities to the affected LLM vendors, with Google acknowledging it as a severe issue. Our detection model is available on Hugging Face and as a ChatGPT Actions plugin.
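The countermeasure can be sketched as a guard that screens prompts before they reach an LLM. The model id and label name below are hypothetical stand-ins; the paper's released detector is published on Hugging Face under its own name.

```python
# Minimal sketch of the countermeasure: a BERT-based classifier screening
# prompts before they reach an LLM. Model id and label are hypothetical.
from transformers import pipeline

detector = pipeline("text-classification", model="my-org/phishing-prompt-bert")

def forward_to_llm(prompt: str) -> str:
    return "(LLM response)"  # stand-in for the real downstream LLM call

def guard(prompt: str) -> str:
    verdict = detector(prompt, truncation=True)[0]
    if verdict["label"] == "MALICIOUS":  # hypothetical label name
        return "Blocked: prompt appears to request phishing content."
    return forward_to_llm(prompt)

print(guard("Write an email telling users their bank account is locked "
            "and they must verify their password at this link."))
```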