NSF PAR Search | NSF Public Access Repository


Metamorphic Detection of Repackaged Malware

https://doi.org/10.1109/MET52542.2021.00009

Singh, Shirish; Kaiser, Gail (June 2021, 6th International Workshop on Metamorphic Testing (MET))
Machine learning-based malware detection systems are often vulnerable to evasion attacks, in which a malware developer manipulates their malicious software such that it is misclassified as benign. Such software hides some properties of the real class or adopts some properties of a different class by applying small perturbations. A special case of evasive malware hides by repackaging a bona fide benign mobile app to contain malware in addition to the original functionality of the app, thus retaining most of the benign properties of the original app. We present a novel malware detection system based on metamorphic testing principles that can detect such benign-seeming malware apps. We apply metamorphic testing to the feature representation of the mobile app, rather than to the app itself. That is, the source input is the original feature vector for the app and the derived input is that vector with selected features removed. If the app was originally classified as benign, and is indeed benign, the output for the source and derived inputs should be the same class, i.e., benign; if the outputs differ, the app is exposed as (likely) malware. Malware apps originally classified as malware should retain that classification, since only features prevalent in benign apps are removed. This approach enables the machine learning model to classify repackaged malware with reasonably few false negatives and false positives. Our training pipeline is simpler than many existing ML-based malware detection methods, as the network is trained end-to-end to jointly learn appropriate features and to perform classification. We pre-trained our classifier model on 3 million apps collected from the widely-used AndroZoo dataset. We perform an extensive study on other publicly available datasets to show our approach’s effectiveness in detecting repackaged malware with more than 94% accuracy, 0.98 precision, 0.95 recall, and 0.96 F1 score.
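The metamorphic relation described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the feature names, the `SUSPICIOUS` and `BENIGN_PREVALENT` sets, and the rule-based `toy_classify` stand-in are hypothetical and are not the authors' neural network or feature set. Only the check itself (classify the source vector, remove benign-prevalent features, classify the derived vector, and flag a label change) follows the abstract.

```python
from typing import Callable, Dict, Set

FeatureVector = Dict[str, int]
Classifier = Callable[[FeatureVector], str]

# Hypothetical feature sets, chosen only to make the example runnable.
SUSPICIOUS: Set[str] = {"send_sms", "load_dex"}
BENIGN_PREVALENT: Set[str] = {"internet", "vibrate", "wake_lock"}


def toy_classify(features: FeatureVector) -> str:
    """Stand-in classifier (an assumption, not the paper's model).

    Labels an app "malware" only when suspicious features outnumber
    benign-prevalent ones, so benign features can mask malicious ones.
    """
    present = {name for name, value in features.items() if value}
    score = len(present & SUSPICIOUS) - len(present & BENIGN_PREVALENT)
    return "malware" if score > 0 else "benign"


def derive_input(source: FeatureVector, benign_prevalent: Set[str]) -> FeatureVector:
    """Build the derived input: the source vector minus benign-prevalent features."""
    return {name: value for name, value in source.items()
            if name not in benign_prevalent}


def metamorphic_verdict(classify: Classifier,
                        source: FeatureVector,
                        benign_prevalent: Set[str]) -> str:
    """Apply the metamorphic relation to one app's feature vector."""
    source_label = classify(source)
    if source_label == "malware":
        # Already classified as malware; removing benign features should not change that.
        return "malware"
    derived_label = classify(derive_input(source, benign_prevalent))
    # A benign app should stay benign; a label change exposes likely malware.
    return "benign" if derived_label == "benign" else "likely malware"


repackaged = {"internet": 1, "vibrate": 1, "wake_lock": 1, "send_sms": 1}
clean = {"internet": 1, "vibrate": 1}
overt = {"send_sms": 1, "load_dex": 1}

for name, app in [("repackaged", repackaged), ("clean", clean), ("overt", overt)]:
    print(name, "->", metamorphic_verdict(toy_classify, app, BENIGN_PREVALENT))
```

In this toy setup the repackaged app is labeled benign on its source vector because its benign features mask the single suspicious one; removing them changes the label, which is the violation the metamorphic check looks for.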
Ad hoc Test Generation Through Binary Rewriting

Saieva, Anthony; Singh, Shirish; Kaiser, Gail (September 2020, IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM))

When a security vulnerability or other critical bug is not detected by the developers' test suite, and is discovered post-deployment, developers must quickly devise a new test that reproduces the buggy behavior. Then the developers need to test whether their candidate patch indeed fixes the bug, without breaking other functionality, while racing to deploy before attackers pounce on exposed user installations. This can be challenging when factors in a specific user environment triggered the bug. If enabled, however, record-replay technology faithfully replays the execution in the developer environment as if the program were executing in that user environment under the same conditions as the bug manifested. This includes intermediate program states dependent on system calls, memory layout, etc. as well as any externally-visible behavior. Many modern record-replay tools integrate interactive debuggers, to help locate the root cause, but don't help the developers test whether their patch indeed eliminates the bug under those same conditions. In particular, modern record-replay tools that reproduce intermediate program state cannot replay recordings made with one version of a program using a different version of the program where the differences affect program state. This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution. These tests reflect the arbitrary (ad hoc) user and system circumstances that uncovered the bug, enabling developers to check whether a patch indeed fixes that bug. 
The tests essentially replay recordings made with one version of a program using a different version of the program, even when the differences impact program state, by manipulating both the binary executable and the recorded log to result in an execution consistent with what would have happened had the patched version executed in the user environment under the same conditions where the bug manifested with the original version. Our approach also enables users to make new recordings of their own workloads with the original version of the program, and automatically generate and run the corresponding ad hoc tests on the patched version, to validate that the patch does not break functionality they rely on.
Detecting Sensor-Based Repackaged Malware

https://doi.org/10.1109/BigData50022.2020.9378145

Liu, Boyu; Yun, Duanyue; Guo, Xin; Ji, Xiao; Song, Huiyu; Singh, Shirish; Kaiser, Gail (December 2020, IEEE International Conference on Big Data (Big Data))
Android is the most targeted mobile OS. Studies have found that repackaging is one of the most common techniques that adversaries use to distribute malware, and detecting such malware can be difficult because it shares large parts of its code with benign apps. Other studies have highlighted the privacy implications of zero-permission sensors. In this work, we investigate whether repackaged malicious apps utilize more sensors than their benign counterparts for malicious purposes. We analyzed 15,297 app pairs for sensor usage. We provide evidence that zero-permission sensors are indeed used by malicious apps to perform various activities. We use this information to train a robust classifier to detect repackaged malware in the wild.
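The pairwise comparison mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how sensor usage was extracted or compared, so the representation of an app as a set of sensor names, the `extra_sensor_counts` helper, and the sample pairs are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Each pair is (sensors used by the original app, sensors used by its repackaged version).
# Representing an app as a set of sensor names is an assumption for this sketch.
AppPair = Tuple[Set[str], Set[str]]


def extra_sensor_counts(pairs: Iterable[AppPair]) -> Counter:
    """Count, per sensor, how many repackaged apps use it when the original does not."""
    counts: Counter = Counter()
    for original, repackaged in pairs:
        counts.update(repackaged - original)
    return counts


# Made-up sample data, not taken from the study.
sample_pairs = [
    ({"accelerometer"}, {"accelerometer", "gyroscope"}),
    ({"light"}, {"light"}),
    (set(), {"gyroscope", "magnetometer"}),
]

added = extra_sensor_counts(sample_pairs)
print(added.most_common())
```

Per-sensor counts like these could serve as one input when building features for a classifier, though the features the authors actually used are not stated here.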
Side Channel Attack on Smartphone Sensors to Infer Gender of the User

Singh, Shirish; Shila, Devu Manikantan; Kaiser, Gail (November 2019, 17th ACM Conference on Embedded Networked Sensor Systems (SenSys))

Smartphones incorporate a plethora of diverse and powerful sensors that enhance user experience. Two such sensors are the accelerometer and the gyroscope, which measure acceleration in all three spatial dimensions and rotation along the three axes of the smartphone, respectively. These sensors are used primarily for screen rotations and advanced gaming applications. However, they can also be employed to gather information about the user’s activity and phone positions. In this work, we investigate using the accelerometer and gyroscope as a side channel to learn highly sensitive information, such as the user’s gender. We present an unobtrusive technique to determine the gender of a user by mining data from the smartphone sensors, which do not require explicit permissions from the user. A preliminary study conducted on 18 participants shows that we can detect the user’s gender with an accuracy of 80%.
