As the underground market for malware flourishes, the number and diversity of malware samples are increasing exponentially. A crucial question in malware analysis research is how to define malware specifications or signatures that faithfully describe similar malicious intent and also clearly stand out from other programs. Although traditional malware specifications based on syntactic signatures are efficient, they can be easily defeated by various obfuscation techniques. Since malicious behavior is often stable across similar malware instances, behavior-based specifications, which capture real malicious characteristics at run time, have become more prevalent in anti-malware tasks such as malware detection and malware clustering. This kind of specification is typically extracted from the dependence graph of the system calls that a malware sample invokes. In this paper, we present replacement attacks that camouflage similar behaviors by poisoning behavior-based specifications. The key method of our attacks is to replace a system call dependence graph with its semantically equivalent variants, so that similar malware samples within one family appear to be different. As a result, malware analysts have to put more effort into reexamining similar samples that may have been investigated before. We distill general attack strategies by mining the behavior specifications of more than 5,200 malware samples, and we implement a compiler-level prototype to automate replacement attacks. Experiments on 960 real malware samples demonstrate the effectiveness of our approach in impeding various behavior-based malware analysis tasks, such as similarity comparison and malware clustering. Finally, we discuss possible countermeasures to strengthen existing malware defenses.
MAB-Malware: A Reinforcement Learning Framework for Blackbox Generation of Adversarial Malware
- Award ID(s): 1719175
- NSF-PAR ID: 10382411
- Date Published:
- Journal Name: ASIA CCS '22: Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security
- Page Range / eLocation ID: 990 to 1003
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Android is the most targeted mobile OS. Studies have found that repackaging is one of the most common techniques that adversaries use to distribute malware, and detecting such malware can be difficult because it shares large parts of its code with benign apps. Other studies have highlighted the privacy implications of zero-permission sensors. In this work, we investigate whether repackaged malicious apps use more sensors than their benign counterparts for malicious purposes. We analyzed 15,297 app pairs for sensor usage. We provide evidence that zero-permission sensors are indeed used by malicious apps to perform various activities. We use this information to train a robust classifier to detect repackaged malware in the wild.
-
Machine learning-based malware detection systems are often vulnerable to evasion attacks, in which a malware developer manipulates their malicious software such that it is misclassified as benign. Such software hides some properties of the real class or adopts some properties of a different class by applying small perturbations. A special case of evasive malware hides by repackaging a bona fide benign mobile app to contain malware in addition to the original functionality of the app, thus retaining most of the benign properties of the original app. We present a novel malware detection system based on metamorphic testing principles that can detect such benign-seeming malware apps. We apply metamorphic testing to the feature representation of the mobile app, rather than to the app itself. That is, the source input is the original feature vector for the app and the derived input is that vector with selected features removed. If the app was originally classified as benign, and is indeed benign, the outputs for the source and derived inputs should be the same class, i.e., benign; if they differ, the app is exposed as (likely) malware. Malware apps originally classified as malware should retain that classification, since only features prevalent in benign apps are removed. This approach enables the machine learning model to classify repackaged malware with reasonably few false negatives and false positives. Our training pipeline is simpler than many existing ML-based malware detection methods, as the network is trained end-to-end to jointly learn appropriate features and to perform classification. We pre-trained our classifier model on 3 million apps collected from the widely used AndroZoo dataset. We perform an extensive study on other publicly available datasets to show our approach's effectiveness in detecting repackaged malware with more than 94% accuracy, 0.98 precision, 0.95 recall, and 0.96 F1 score.
-
This tutorial provides a review of state-of-the-art research on, and applications of, Artificial Intelligence and Machine Learning for malware analysis. We will provide an overview, background, and results with respect to the three main malware analysis approaches: static malware analysis, dynamic malware analysis, and online malware analysis. Further, we will provide a simplified hands-on tutorial on applying an ML algorithm for dynamic malware analysis in cloud IaaS.
-
Combating OS-level malware is a very challenging problem, as this type of malware can compromise the operating system, obtain kernel privilege, and subvert almost all existing anti-malware tools. This work aims to address this problem in the context of mobile devices. As real-world malware is very heterogeneous, we narrow the scope of our work by focusing on a special type of OS-level malware that always corrupts user data. We have designed mobiDOM, the first framework that can combat OS-level data corruption malware on mobile computing devices. mobiDOM contains two components, a malware detector and a data repairer. The malware detector can detect the presence of OS-level malware securely and in a timely manner by fully utilizing the existing hardware features of a mobile device, namely flash memory and Arm TrustZone. Specifically, we integrate the malware detection into the flash translation layer (FTL), a firmware layer embedded in the flash storage hardware, which is inaccessible to the OS; in addition, we run a trusted application in the Arm TrustZone secure world, which acts as a user-level manager of the malware detector. The FTL-based malware detection and the TrustZone-based manager can communicate with each other stealthily via steganography. The data repairer can restore the external storage to a healthy historical state by taking advantage of the out-of-place-update feature of flash memory and our malware-aware garbage collection in the FTL. Security analysis and experimental evaluation on a real-world testbed confirm the effectiveness of mobiDOM.