With rapid technological advancement, security has become a major issue: increasing malware activity poses a serious threat to the security and safety of both computer systems and stakeholders. To maintain the security of stakeholders, particularly end users, protecting data from fraudulent efforts is one of the most pressing concerns. Malware refers to a set of malicious programming code, scripts, active content, or intrusive software designed to destroy targeted computer systems and programs or mobile and web applications. According to a study, naive users are unable to distinguish between malicious and benign applications. Thus, computer systems and mobile applications should be designed to detect malicious activities in order to protect stakeholders. A number of algorithms are available to detect malware activities using novel concepts including Artificial Intelligence, Machine Learning, and Deep Learning. In this study, we emphasize Artificial Intelligence (AI) based techniques for detecting and preventing malware activity. We present a detailed review of current malware detection technologies, their shortcomings, and ways to improve efficiency. Our study shows that adopting futuristic approaches for the development of malware detection applications can provide significant advantages. This synthesis should help researchers pursue further research on malware detection and prevention using AI.
This content will become publicly available on December 9, 2025
DEEPCAPA: Identifying Malicious Capabilities in Windows Malware
Malware detection and classification have been the focus of extensive research over many years. However, less effort has been devoted to developing post-detection systems that identify specific malicious capabilities (or behaviors) in malware. Such systems play a critical part in identifying and mitigating the damage caused by malware attacks. Unfortunately, current methods for identifying malware capabilities involve substantial manual reverse engineering effort and context switching between multiple tools, which slows down an investigation and gives attackers an advantage. In this paper, we propose DEEPCAPA, an automated post-detection system that uses machine learning to identify potentially malicious capabilities in malware in the form of MITRE ATT&CK techniques. Our system operates on sequences of API calls extracted from memory snapshots taken at key points during the (sandboxed) execution of malware. Our results demonstrate that DEEPCAPA can accurately identify malicious capabilities, achieving a precision of 95.80% and a recall of 93.76% across 29 different techniques.
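For readers unfamiliar with how precision and recall apply when each sample can carry several technique labels, the following is a minimal sketch. It assumes micro-averaging over all (sample, technique) pairs; the abstract does not say how its figures were aggregated, so this is an illustration and not the authors' evaluation code. The function name `precision_recall` and the sample data are hypothetical.

```python
from typing import List, Set, Tuple


def precision_recall(
    truth: List[Set[str]], predicted: List[Set[str]]
) -> Tuple[float, float]:
    """Micro-averaged precision and recall for multi-label predictions."""
    tp = fp = fn = 0
    for true_labels, pred_labels in zip(truth, predicted):
        tp += len(true_labels & pred_labels)  # techniques correctly reported
        fp += len(pred_labels - true_labels)  # reported but not present
        fn += len(true_labels - pred_labels)  # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical samples, each labelled with ATT&CK technique IDs (illustrative only).
truth = [{"T1055", "T1112"}, {"T1547"}, {"T1055"}]
predicted = [{"T1055"}, {"T1547", "T1112"}, {"T1055"}]

precision, recall = precision_recall(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes techniques reported for a sample that does not exhibit them, while recall penalizes techniques that were present but missed.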
- Award ID(s): 2229876
- PAR ID: 10594823
- Publisher / Repository: IEEE
- Date Published:
- ISBN: 979-8-3315-2088-5
- Page Range / eLocation ID: 826 to 842
- Format(s): Medium: X
- Location: Honolulu, HI, USA
- Sponsoring Org: National Science Foundation
More Like this
-
Prof. Ninghui Li, Editor in Chief, ACM Transactions (Ed.)
Malware analysis is an essential task for understanding infection campaigns, the behavior of malicious code, and possible ways to mitigate threats. Malware analysis also allows better assessment of attackers’ capabilities, techniques, and processes. Although a substantial amount of previous work has provided a comprehensive analysis of the international malware ecosystem, research on regionalized, country-specific, and population-specific malware campaigns has been scarce. To help address this gap, we conducted a longitudinal (2012-2020) and comprehensive (encompassing an entire population of online banking users) study of MS Windows desktop malware that actually infected Brazilian bank’s users. We found that Brazilian financial desktop malware has been evolving quickly: it started to make use of a variety of file formats instead of typical PE binaries, relied on native system resources, and abused obfuscation techniques to bypass detection mechanisms. Our study of the threats targeting a significant population in the ecosystem of the largest and most populous country in Latin America can provide invaluable insights that may be applied to other countries’ user populations, especially those in the developing world that might face cultural peculiarities similar to Brazil’s. With this evaluation, we expect to motivate the security community and industry to seriously consider a deeper level of customization during the development of next-generation anti-malware solutions, as well as to raise awareness of regionalized and targeted Internet threats.
-
Malicious software, popularly known as malware, is widely acknowledged as a serious threat to modern computing systems. Software-based solutions, such as anti-virus software, are not effective since they rely on matching patterns that can be easily fooled by carefully crafted malware with obfuscation or other deviation capabilities. While recent malware detection methods provide promising results through effective utilization of hardware features, the detection results cannot be interpreted in a meaningful way. In this paper, we propose a hardware-assisted malware detection framework using explainable machine learning. This paper makes three important contributions. First, we theoretically establish that our proposed method can provide an interpretable explanation of classification results to address the challenge of transparency. Next, we show that the explainable outcome can lead to accurate localization of malicious behaviors. Finally, experimental evaluation using a wide variety of real-world malware benchmarks demonstrates that our framework can produce accurate and human-understandable malware detection results with provable guarantees.
-
Thanks to the extensive research on machine learning based malware detection (MLMD) in recent years and readily available online malware scanning systems (e.g., VirusTotal), it has become relatively easy to build a seemingly successful MLMD system using the following standard procedure: first prepare a set of ground truth data by checking with VirusTotal, then extract features from the training dataset and build a machine learning detection model, and finally evaluate the model with a disjoint testing dataset. We argue that such evaluation methods do not expose the real utility of ML based malware detection in practice, since the ML model is both built and tested on malware that is known at the time of training. The user could simply run the samples through VirusTotal, just as the researchers did to obtain the ground truth, instead of using the more sophisticated ML approach. However, ML based malware detection has the potential to identify malware that was not known at the time of training, which is the real value ML brings to this problem. We present an experimental study of how well a machine learning based malware detection system can achieve this. Our experiments showed that MLMD can consistently generate previously unknown malware knowledge, e.g., malware that is not detectable by existing malware detection systems at MLMD’s training time. Our research illustrates an ideal usage scenario for MLMD systems and demonstrates that such systems can benefit malware detection in practice. For example, by utilizing the new signals provided by the MLMD system and the detection capability of existing malware detection systems, we can more quickly uncover new malware variants or families.
-
The ubiquitous deployment of robots across diverse domains, from industrial automation to personal care, underscores their critical role in modern society. However, this growing dependence has also revealed security vulnerabilities. One attack vector involves the deployment of malicious software (malware) on robots, which can cause harm to the robots themselves, users, and even the surrounding environment. Machine learning approaches, particularly supervised ones, have shown promise in malware detection by building intricate models to identify known malicious code patterns. However, these methods are inherently limited in detecting unseen or zero-day malware variants, as they require regularly updated massive datasets that might be unavailable to robots. To address this challenge, we introduce ROBOGUARDZ, a novel malware detection framework based on zero-shot learning for robots. This approach allows ROBOGUARDZ to identify unseen malware by establishing relationships between known malicious code and benign behaviors, allowing detection even before the code executes on the robot. To ensure practical deployment on resource-constrained robotic hardware, we employ a unique parallel structured pruning and quantization strategy that compresses the ROBOGUARDZ detection model by 37.4% while maintaining its accuracy. This strategy reduces the size of the model and its computational demands, making it suitable for real-world robotic systems. We evaluated ROBOGUARDZ on a recent dataset containing real-world binary executables from multi-sensor autonomous car controllers. The framework was deployed on two popular robot embedded hardware platforms. Our results demonstrate an average detection accuracy of 94.25% and a low false negative rate of 5.8% with a minimal latency of 20 ms, which demonstrates its effectiveness and practicality.
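The MLMD entry in the list above argues that a detector should be judged on malware that was unknown when the model was trained. One common way to approximate that condition is a time-based split instead of a random one. The sketch below is only an illustration of that idea and is not taken from any of the listed papers; the `Sample` record, its `first_seen` field, and the cutoff date are all hypothetical.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    name: str          # hypothetical sample identifier
    first_seen: date   # assumed date the sample was first observed
    is_malware: bool


def temporal_split(
    samples: List[Sample], cutoff: date
) -> Tuple[List[Sample], List[Sample]]:
    """Train on samples seen before the cutoff, test on those seen on or after it."""
    train = [s for s in samples if s.first_seen < cutoff]
    test = [s for s in samples if s.first_seen >= cutoff]
    return train, test


# Illustrative data only.
samples = [
    Sample("a.exe", date(2020, 1, 10), True),
    Sample("b.exe", date(2020, 3, 5), False),
    Sample("c.exe", date(2020, 9, 1), True),
]

train, test = temporal_split(samples, cutoff=date(2020, 6, 1))
print([s.name for s in train], [s.name for s in test])
```

With a random split, variants of the same family can land on both sides, so the test set contains malware that was effectively known at training time; splitting by date keeps later samples out of training.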
