Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Malicious software (malware) poses a significant threat to the security of our networks and users. In the ever-evolving malware landscape, Excel 4.0 Office macros (XL4) have recently become an important attack vector. These macros are often hidden within apparently legitimate documents and under several layers of obfuscation. As such, they are difficult to analyze using static analysis techniques. Moreover, the analysis in a dynamic analysis environment (a sandbox) is challenging because the macros execute correctly only under specific environmental conditions that are not always easy to create. This paper presents SYMBEXCEL, a novel solution that leverages symbolic execution to deobfuscate and analyze Excel 4.0 macros automatically. Our approach proceeds in three stages: (1) The malicious document is parsed and loaded in memory; (2) Our symbolic execution engine executes the XL4 formulas; and (3) Our Engine concretizes any symbolic values encountered during the symbolic exploration, therefore evaluating the execution of each macro under a broad range of (meaningful) environment configurations. SYMBEXCEL significantly outperforms existing deobfuscation tools, allowing us to reliably extract Indicators of Compromise (IoCs) and other critical forensics information. Our experiments demonstrate the effectiveness of our approach, especially in deobfuscating novel malicious documents that make heavy use of environment variables and are often not identified by commercial anti-virus software.more » « less
-
Prof. Ninghui Li Editor in Chief, ACM Transactions (Ed.)Malware analysis is an essential task to understand infection campaigns, the behavior of malicious codes, and possible ways to mitigate threats. Malware analysis also allows better assessment of attacker’s capabilities, techniques, and processes. Although a substantial amount of previous work provided a comprehensive analysis of the international malware ecosystem, research on regionalized, country, and population-specific malware campaigns have been scarce. Moving towards addressing this gap, we conducted a longitudinal (2012-2020) and comprehensive (encompassing an entire population of online banking users) study of MS Windows desktop malware that actually infected Brazilian bank’s users. We found that the Brazilian financial desktop malware has been evolving quickly: it started to make use of a variety of file formats instead of typical PE binaries, relied on native system resources, and abused obfuscation technique to bypass detection mechanisms. Our study on the threats targeting a significant population on the ecosystem of the largest and most populous country in Latin America can provide invaluable insights that may be applied to other countries’ user populations, especially those in the developing world that might face cultural peculiarities similar to Brazil’s. With this evaluation, we expect to motivate the security community/industry to seriously considering a deeper level of customization during the development of next generation anti-malware solutions, as well as to raise awareness towards regionalized and targeted Internet threats.more » « less
-
Machine learning techniques are widely used in addition to signatures and heuristics to increase the detection rate of anti-malware software, as they automate the creation of detection models, making it possible to handle an ever-increasing number of new malware samples. In order to foil the analysis of anti-malware systems and evade detection, malware uses packing and other forms of obfuscation. However, few realize that benign applications use packing and obfuscation as well, to protect intellectual property and prevent license abuse. In this paper, we study how machine learning based on static analysis features operates on packed samples. Malware researchers have often assumed that packing would prevent machine learning techniques from building effective classifiers. However, both industry and academia have published results that show that machine-learning-based classifiers can achieve good detection rates, leading many experts to think that classifiers are simply detecting the fact that a sample is packed, as packing is more prevalent in malicious samples. We show that, different from what is commonly assumed, packers do preserve some information when packing programs that is “useful” for malware classification. However, this information does not necessarily capture the sample’s behavior. We demonstrate that the signals extracted from packed executables are not rich enough for machine-learning-based models to (1) generalize their knowledge to operate on unseen packers, and (2) be robust against adversarial examples. We also show that a na¨ıve application of machine learning techniques results in a substantial number of false positives, which, in turn, might have resultedmore » « less