Diverse datasets and a customizable benchmarking framework for phishing

Zeng, V.

doi:10.1145/3375708.3380313

Phishing is a serious challenge that remains largely unsolved despite the efforts of many researchers. In this paper, we present datasets and tools to help phishing researchers. First, we describe our efforts to create high-quality, diverse, and representative email and URL/website datasets for phishing and to make them publicly available. Second, we describe PhishBench, a benchmarking framework that automates the extraction of more than 200 features and implements more than 30 classifiers and 12 evaluation metrics for the detection of phishing emails, websites, and URLs. Using PhishBench, the research community can easily run their models and benchmark their work against that of others who have used common dataset sources for emails (Nazario, SpamAssassin, WikiLeaks, etc.) and URLs (PhishTank, APWG, Alexa, etc.).

Award ID(s): 1659755

PAR ID: 10224700

Author(s) / Creator(s): Zeng, V.

Date Published: 2020-03-01

Journal Name: Proceedings of the Sixth International Workshop on Security and Privacy Analytics

Page Range / eLocation ID: 35-41

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3375708.3380313
