NSF PAR Search | NSF Public Access Repository

Evading Deep Learning-Based Malware Detectors via Obfuscation: A Deep Reinforcement Learning Approach

https://doi.org/10.1109/ICDM58522.2023.00019

Etter, Brian; Hu, James Lee; Ebrahimi, Mohammadreza; Li, Weifeng; Li, Xin; Chen, Hsinchun (December 2023, IEEE)

Adversarial Malware Generation (AMG), the generation of adversarial malware variants to strengthen Deep Learning (DL)-based malware detectors, has emerged as a crucial tool in the development of proactive cyberdefense. However, the majority of extant works offer subtle perturbations or additions to executable files and do not explore full-file obfuscation. In this study, we show that an open-source encryption tool coupled with a Reinforcement Learning (RL) framework can successfully obfuscate malware to evade state-of-the-art malware detection engines and outperform techniques that use advanced modification methods. Our results show that the proposed method improves the evasion rate by 27% to 49% compared to widely used state-of-the-art reinforcement learning-based methods.
Full Text Available
Counteracting Dark Web Text-Based CAPTCHA with Generative Adversarial Learning for Proactive Cyber Threat Intelligence

https://doi.org/10.1145/3505226

Zhang, Ning; Ebrahimi, Mohammadreza; Li, Weifeng; Chen, Hsinchun (June 2022, ACM Transactions on Management Information Systems)

Automated monitoring of dark web (DW) platforms on a large scale is the first step toward developing proactive Cyber Threat Intelligence (CTI). While there are efficient methods for collecting data from the surface web, large-scale dark web data collection is often hindered by anti-crawling measures. In particular, text-based CAPTCHA serves as the most prevalent and prohibitive type of these measures in the dark web. Text-based CAPTCHA identifies and blocks automated crawlers by forcing the user to enter a combination of hard-to-recognize alphanumeric characters. In the dark web, CAPTCHA images are meticulously designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing automated CAPTCHA breaking methods have difficulty overcoming these dark web challenges. As such, solving dark web text-based CAPTCHA has relied heavily on human involvement, which is labor-intensive and time-consuming. In this study, we propose a novel framework for automated breaking of dark web CAPTCHA to facilitate dark web data collection. This framework encompasses a novel generative method to recognize dark web text-based CAPTCHA with noisy background and variable character length. To eliminate the need for human involvement, the proposed framework utilizes a Generative Adversarial Network (GAN) to counteract dark web background noise and leverages an enhanced character segmentation algorithm to handle CAPTCHA images with variable character length. Our proposed framework, DW-GAN, was systematically evaluated on multiple dark web CAPTCHA testbeds. DW-GAN significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving a success rate of over 94.4% on a carefully collected real-world dark web dataset. We further conducted a case study on an emergent Dark Net Marketplace (DNM) to demonstrate that DW-GAN eliminated human involvement by automatically solving CAPTCHA challenges with no more than three attempts. Our research enables the CTI community to develop advanced, large-scale dark web monitoring. We make DW-GAN code available to the community as an open-source tool on GitHub.
Full Text Available
Key Factors Affecting User Adoption of Open-Access Data Repositories in Intelligence and Security Informatics: An Affordance Perspective

https://doi.org/10.1145/3460823

Wen, Bo; Hu, Paul Jen-Hwa; Ebrahimi, Mohammadreza; Chen, Hsinchun (March 2022, ACM Transactions on Management Information Systems)

Rich, diverse cybersecurity data are critical for efforts by the intelligence and security informatics (ISI) community. Although open-access data repositories (OADRs) provide tremendous benefits for ISI researchers and practitioners, determinants of their adoption remain understudied. Drawing on affordance theory and extant ISI literature, this study proposes a factor model to explain how the essential and unique affordances of an OADR (i.e., relevance, accessibility, and integration) affect individual professionals' intentions to use and collaborate with AZSecure, a major OADR. A survey study designed to test the model and hypotheses reveals that the effects of affordances on ISI professionals' intentions to use and collaborate are mediated by perceived usefulness and ease of use, which then jointly determine their perceived value. This study advances ISI research by specifying three important affordances of OADRs; it also contributes to extant technology adoption literature by scrutinizing and affirming the interplay of essential user acceptance and value perceptions to explain ISI professionals' adoption of OADRs.
Full Text Available
Two-Layer Coded Channel Access With Collision Resolution: Design and Analysis

https://doi.org/10.1109/TWC.2020.3018472

Ebrahimi, Mohammadreza; Lahouti, Farshad; Kostina, Victoria (December 2020, IEEE Transactions on Wireless Communications)
Full Text Available
Binary Black-Box Attacks Against Static Malware Detectors with Reinforcement Learning in Discrete Action Spaces

https://doi.org/10.1109/SPW53761.2021.00021

Ebrahimi, Mohammadreza; Pacheco, Jason; Li, Weifeng; Hu, James Lee; Chen, Hsinchun (May 2021, IEEE S&P Workshop on Deep Learning and Security (DLS))
Full Text Available
A Generative Adversarial Learning Framework for Breaking Text-Based CAPTCHA in the Dark Web

https://doi.org/10.1109/ISI49825.2020.9280537

Zhang, Ning; Ebrahimi, Mohammadreza; Li, Weifeng; Chen, Hsinchun (November 2020, IEEE International Conference on Intelligence and Security Informatics (IEEE ISI 2020))
Full Text Available
Binary Black-box Evasion Attacks Against Deep Learning-based Static Malware Detectors with Adversarial Byte-Level Language Model

https://doi.org/2012.07994

Ebrahimi, Mohammadreza; Zhang, Ning; Hu, James; Raza, Muhammad Taqi; Chen, Hsinchun (January 2021, AAAI Workshop on Robust, Secure and Efficient Machine Learning (RSEML))
Full Text Available
Detecting Cyber Threats in Non-English Hacker Forums: An Adversarial Cross-Lingual Knowledge Transfer Approach

https://doi.org/10.1109/SPW50608.2020.00021

Ebrahimi, Mohammadreza; Samtani, Sagar; Chai, Yidong; Chen, Hsinchun (May 2020, 41st IEEE Symposium on Security and Privacy (S&P), 3rd Deep Learning for Security (DLS) Workshop)
Full Text Available
Linking Personally Identifiable Information from the Dark Web to the Surface Web: A Deep Entity Resolution Approach

https://doi.org/10.1109/ICDMW51313.2020.00072

Lin, Fangyu; Liu, Yizhi; Ebrahimi, Mohammadreza; Ahmad-Post, Zara; Hu, James Lee; Xin, Jingyu; Samtani, Sagar; Li, Weifeng; Chen, Hsinchun (November 2020, International Conference on Data Mining Workshops (ICDMW))
The information privacy of Internet users has become a major societal concern. The rapid growth of online services increases the risk of unauthorized access to Personally Identifiable Information (PII) of at-risk populations, who are unaware of their PII exposure. To proactively identify online at-risk populations and increase their privacy awareness, it is crucial to conduct a holistic privacy risk assessment across the Internet. Current privacy risk assessment studies are limited to a single platform within either the surface web or the dark web. A comprehensive privacy risk assessment requires matching exposed PII on heterogeneous online platforms across the surface web and the dark web. However, due to the incompleteness and inaccuracy of PII records in each platform, linking the exposed PII to users is a non-trivial task. While Entity Resolution (ER) techniques can be used to facilitate this task, they often require ad-hoc, manual rule development and feature engineering. Recently, Deep Learning (DL)-based ER has outperformed manual entity matching rules by automatically extracting prominent features from incomplete or inaccurate records. In this study, we enhance the existing privacy risk assessment with a DL-based ER method, namely Multi-Context Attention (MCA), to comprehensively evaluate individuals’ PII exposure across different online platforms in the dark web and surface web. Evaluation against benchmark ER models indicates the efficacy of MCA. Using MCA on a random sample of data breach victims in the dark web, we are able to identify 4.3% of the victims on surface web platforms and calculate their privacy risk scores.
Full Text Available
Identifying, Collecting, and Monitoring Personally Identifiable Information: From the Dark Web to the Surface Web

https://doi.org/10.1109/ISI49825.2020.9280540

Liu, Yizhi; Lin, Fang Yu; Ahmad-Post, Zara; Ebrahimi, Mohammadreza; Zhang, Ning; Hu, James Lee; Xin, Jingyu; Li, Weifeng; Chen, Hsinchun (November 2020, IEEE International Conference on Intelligence and Security Informatics (IEEE ISI 2020))
Full Text Available