Using Knowledge Graphs and Reinforcement Learning for Malware Analysis

Piplai, Aritran; Ranade, Priyanka; Kotal, Anantaa; Mittal, Sudip; Narayanan, Sandeep Nair; Joshi, Anupam

doi:10.1109/BigData50022.2020.9378491

Citation Details

Using Knowledge Graphs and Reinforcement Learning for Malware Analysis

Machine learning algorithms used to detect attacks are limited by the fact that they cannot incorporate the back-ground knowledge that an analyst has. This limits their suitability in detecting new attacks. Reinforcement learning is different from traditional machine learning algorithms used in the cybersecurity domain. Compared to traditional ML algorithms, reinforcement learning does not need a mapping of the input-output space or a specific user-defined metric to compare data points. This is important for the cybersecurity domain, especially for malware detection and mitigation, as not all problems have a single, known, correct answer. Often, security researchers have to resort to guided trial and error to understand the presence of a malware and mitigate it.In this paper, we incorporate prior knowledge, represented as Cybersecurity Knowledge Graphs (CKGs), to guide the exploration of an RL algorithm to detect malware. CKGs capture semantic relationships between cyber-entities, including that mined from open source. Instead of trying out random guesses and observing the change in the environment, we aim to take the help of verified knowledge about cyber-attack to guide our reinforcement learning algorithm to effectively identify ways to detect the presence of malicious filenames so that they can be deleted to mitigate a cyber-attack. We show that such a guided system outperforms a base RL system in detecting malware. more »

Award ID(s):: 2025685 2133190

NSF-PAR ID:: 10229649

Author(s) / Creator(s):: Piplai, Aritran; Ranade, Priyanka; Kotal, Anantaa; Mittal, Sudip; Narayanan, Sandeep Nair; Joshi, Anupam

Date Published:: 2020-12-10

Journal Name:: 2020 IEEE International Conference on Big Data (Big Data)

Page Range / eLocation ID:: 2626 to 2633

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Conference Paper:
https://doi.org/10.1109/BigData50022.2020.9378491

More Like this