skip to main content


This content will become publicly available on August 31, 2024

Title: Evading provenance-based ML detectors with adversarial system actions
We present PROVNINJA, a framework designed to generate adversarial attacks that aim to elude provenance-based Machine Learning (ML) security detectors. PROVNINJA is designed to identify and craft adversarial attack vectors that statistically mimic and impersonate system programs. Leveraging the benign execution profile of system processes commonly observed across a multitude of hosts and networks, our research proposes an efficient and effective method to probe evasive alternatives and devise stealthy attack vectors that are difficult to distinguish from benign system behaviors. PROVNINJA's suggestions for evasive attacks, originally derived in the feature space, are then translated into system actions, leading to the realization of actual evasive attack sequences in the problem space. When evaluated against State-of-The-Art (SOTA) detector models using two realistic Advanced Persistent Threat (APT) scenarios and a large collection of fileless malware samples, PROVNINJA could generate and realize evasive attack variants, reducing the detection rates by up to 59%. We also assessed PROVNINJA under varying assumptions on adversaries' knowledge and capabilities. While PROVNINJA primarily considers the black-box model, we also explored two contrasting threat models that consider blind and whitebox attack scenarios.  more » « less
Award ID(s):
1750911
NSF-PAR ID:
10502721
Author(s) / Creator(s):
; ; ; ; ; ; ;
Publisher / Repository:
SEC '23: Proceedings of the 32nd USENIX Conference on Security Symposium
Date Published:
Page Range / eLocation ID:
1199–1216
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present PROVNINJA, a framework designed to generate adversarial attacks that aim to elude provenance-based Machine Learning (ML) security detectors. PROVNINJA is designed to identify and craft adversarial attack vectors that statistically mimic and impersonate system programs. Leveraging the benign execution profile of system processes commonly observed across a multitude of hosts and networks, our research proposes an efficient and effective method to probe evasive alternatives and devise stealthy attack vectors that are difficult to distinguish from benign system behaviors. PROVNINJA's suggestions for evasive attacks, originally derived in the feature space, are then translated into system actions, leading to the realization of actual evasive attack sequences in the problem space. When evaluated against State-of-The-Art (SOTA) detector models using two realistic Advanced Persistent Threat (APT) scenarios and a large collection of fileless malware samples, PROVNINJA could generate and realize evasive attack variants, reducing the detection rates by up to 59%. We also assessed PROVNINJA under varying assumptions on adversaries' knowledge and capabilities. While PROVNINJA primarily considers the black-box model, we also explored two contrasting threat models that consider blind and white-box attack scenarios. 
    more » « less
  2. We present PROVNINJA, a framework designed to generate adversarial attacks that aim to elude provenance-based Machine Learning (ML) security detectors. PROVNINJA is designed to identify and craft adversarial attack vectors that statistically mimic and impersonate system programs. Leveraging the benign execution profile of system processes commonly observed across a multitude of hosts and networks, our research proposes an efficient and effective method to probe evasive alternatives and devise stealthy attack vectors that are difficult to distinguish from benign system behaviors. PROVNINJA's suggestions for evasive attacks, originally derived in the feature space, are then translated into system actions, leading to the realization of actual evasive attack sequences in the problem space. When evaluated against State-of-The-Art (SOTA) detector models using two realistic Advanced Persistent Threat (APT) scenarios and a large collection of fileless malware samples, PROVNINJA could generate and realize evasive attack variants, reducing the detection rates by up to 59%. We also assessed PROVNINJA under varying assumptions on adversaries' knowledge and capabilities. While PROVNINJA primarily considers the black-box model, we also explored two contrasting threat models that consider blind and white-box attack scenarios. 
    more » « less
  3. Machine-learning models are known to be vulnerable to evasion attacks, which perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the threat cannot be assessed accurately by existing attacks. Specifically, we find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model’s ability to withstand attacks from one set of source classes to another set of target classes. To address the shortcomings of existing methods, we formally define a new metric, termed group-based robustness, that complements existing metrics and is better suited for evaluating model performance in certain attack scenarios. We show empirically that group-based robustness allows us to distinguish between machine-learning models’ vulnerability against specific threat models in situations where traditional robustness metrics do not apply. Moreover, to measure group-based robustness efficiently and accurately, we 1) propose two loss functions and 2) identify three new attack strategies. We show empirically that, with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes, and that finding evasive samples, using our new attack strategies, saves time by up to 99% compared to brute-force search methods. Finally, we propose a defense method that increases group-based robustness by up to 3.52 times. 
    more » « less
  4. null (Ed.)
    In recent years, enterprises have been targeted by advanced adversaries who leverage creative ways to infiltrate their systems and move laterally to gain access to critical data. One increasingly common evasive method is to hide the malicious activity behind a benign program by using tools that are already installed on user computers. These programs are usually part of the operating system distribution or another user-installed binary, therefore this type of attack is called “Living-Off-The-Land”. Detecting these attacks is challenging, as adversaries may not create malicious files on the victim computers and anti-virus scans fail to detect them. We propose the design of an Active Learning framework called LOLAL for detecting Living-Off-the-Land attacks that iteratively selects a set of uncertain and anomalous samples for labeling by a human analyst. LOLAL is specifically designed to work well when a limited number of labeled samples are available for training machine learning models to detect attacks. We investigate methods to represent command-line text using word-embedding techniques, and design ensemble boosting classifiers to distinguish malicious and benign samples based on the embedding representation. We leverage a large, anonymized dataset collected by an endpoint security product and demonstrate that our ensemble classifiers achieve an average F1 score of 96% at classifying different attack classes. We show that our active learning method consistently improves the classifier performance, as more training data is labeled, and converges in less than 30 iterations when starting with a small number of labeled instances. 
    more » « less
  5. With the increasing popularity of containerized applications, container registries have hosted millions of repositories that allow developers to store, manage, and share their software. Unfortunately, they have also become a hotbed for adversaries to spread malicious images to the public. In this paper, we present the first in-depth study on the vulnerability of container registries to typosquatting attacks, in which adversaries intentionally upload malicious images with an identification similar to that of a benign image so that users may accidentally download malicious images due to typos. We demonstrate that such typosquatting attacks could pose a serious security threat in both public and private registries as well as across multiple platforms. To shed light on the container registry typosquatting threat, we first conduct a measurement study and a 210-day proof-of-concept exploitation on public container registries, revealing that human users indeed make random typos and download unwanted container images. We also systematically investigate attack vectors on private registries and reveal that its naming space is open and could be easily exploited for launching a typosquatting attack. In addition, for a typosquatting attack across multiple platforms, we demonstrate that adversaries can easily self-host malicious registries or exploit existing container registries to manipulate repositories with similar identifications. Finally, we propose CRYSTAL, a lightweight extension to existing image management, which effectively defends against typosquatting attacks from both container users and registries. 
    more » « less