NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

LocalIntel: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge

https://doi.org/10.1007/978-3-031-87496-3_5

Mitra, Shaswata; Neupane, Subash; Chakraborty, Trisha; Mittal, Sudip; Piplai, Aritran; Gaur, Manas; Rahimi, Shahram (January 2025, Springer Nature Switzerland)

Full Text Available
Enhancing classroom teaching with LLMs and RAG

https://doi.org/10.1145/3686852.3687083

Mullins, Elizabeth; Portillo, Adrian; Ruiz_Rohena, Kristalys; Piplai, Aritran (October 2024, ACM)

Full Text Available
PrivComp-KG: Leveraging KG and LLM for Compliance Verification

https://doi.org/10.1109/TPS-ISA62245.2024.00021

Garza, Leon; Elluri, Lavanya; Piplai, Aritran; Kotal, Anantaa; Gupta, Deepti; Joshi, Anupam (October 2024, IEEE)

Regulatory documents are complex and lengthy, making full compliance a challenging task for businesses. Similarly, privacy policies provided by vendors frequently fall short of the necessary legal standards due to insufficient detail. To address these issues, we propose a solution that leverages a Large Language Model (LLM) in combination with Semantic Web technology. This approach aims to clarify regulatory requirements and ensure that organizations’ privacy policies align with the relevant legal frameworks, ultimately simplifying the compliance process, reducing privacy risks, and improving efficiency. In this paper, we introduce a novel tool, the Privacy Policy Compliance Verification Knowledge Graph, referred to as PrivComp-KG. PrivComp-KG is designed to efficiently store and retrieve comprehensive information related to privacy policies, regulatory frameworks, and domain-specific legal knowledge. By utilizing LLM and Retrieval Augmented Generation (RAG), we can accurately identify relevant sections in privacy policies and map them to the corresponding regulatory rules. Our LLM-based retrieval system has demonstrated a high level of accuracy, achieving a correctness score of 0.9, outperforming other models in privacy policy analysis. The extracted information from individual privacy policies is then integrated into the PrivComp-KG. By combining this data with contextual domain knowledge and regulatory rules, PrivComp-KG can be queried to assess each vendor’s compliance with applicable regulations. We demonstrate the practical utility of PrivComp-KG by verifying the compliance of privacy policies across various organizations. This approach not only helps policy writers better understand legal requirements but also enables them to identify gaps in existing policies and update them in response to evolving regulations.
more » « less
Full Text Available
Knowledge-Enhanced Neurosymbolic Artificial Intelligence for Cybersecurity and Privacy

https://doi.org/10.1109/MIC.2023.3299435

Piplai, Aritran; Kotal, Anantaa; Mohseni, Seyedreza; Gaur, Manas; Mittal, Sudip; Joshi, Anupam (September 2023, IEEE Internet Computing)

Neurosymbolic artificial intelligence (AI) is an emerging and quickly advancing field that combines the subsymbolic strengths of (deep) neural networks and the explicit, symbolic knowledge contained in knowledge graphs (KGs) to enhance explainability and safety in AI systems. This approach addresses a key criticism of current generation systems, namely, their inability to generate human-understandable explanations for their outcomes and ensure safe behaviors, especially in scenarios with unknown unknowns (e.g., cybersecurity, privacy). The integration of neural networks, which excel at exploring complex data spaces, and symbolic KGs representing domain knowledge, allows AI systems to reason, learn, and generalize in a manner understandable to experts. This article describes how applications in cybersecurity and privacy, two of the most demanding domains in terms of the need for AI to be explainable while being highly accurate in complex environments, can benefit from neurosymbolic AI.
more » « less
Full Text Available
Offline RL+CKG: A hybrid AI model for cybersecurity tasks

Piplai, Aritran; Joshi, Anupam; Finin, Tim (April 2023, Proceedings of the AAAI 2023 Spring Symposium on Challenges Requiring the Combination of Machine Learning and Knowledge Engineering (AAAI-MAKE 2023))
Martin, A; Hinkelmann, K; Fill, H.-G.; Gerber, A.; Lenat, D.; Stolle, R.; van Harmelen, F. (Ed.)
AI models for cybersecurity have to detect and defend against constantly evolving cyber threats. Much effort is spent building defenses for zero days and unseen variants of known cyber-attacks. Current AI models for cybersecurity struggle with these yet unseen threats due to the constantly evolving nature of threat vectors, vulnerabilities, and exploits. This paper shows that cybersecurity AI models will be improved and more general if we include semi-structured representations of background knowledge. This could include information about the software and systems, as well as information obtained from observing the behavior of malware samples captured and detonated in honeypots. We describe how we can transfer this knowledge into forms that the RL models can directly use for decision-making purposes.
more » « less
Full Text Available
Knowledge Guided Two-player Reinforcement Learning for Cyber Attacks and Defenses

https://doi.org/10.1109/icmla55696.2022.00213

Piplai, Aritran; Anoruo, Mike; Fasaye, Kayode; Joshi, Anupam; Finin, Tim; Ridley, Ahmad (December 2022, 21st IEEE International Conference on Machine Learning and Applications)

Cyber defense exercises are an important avenue to understand the technical capacity of organizations when faced with cyber-threats. Information derived from these exercises often leads to finding unseen methods to exploit vulnerabilities in an organization. These often lead to better defense mechanisms that can counter previously unknown exploits. With recent developments in cyber battle simulation platforms, we can generate a defense exercise environment and train reinforcement learning (RL) based autonomous agents to attack the system described by the simulated environment. In this paper, we describe a two-player game-based RL environment that simultaneously improves the performance of both the attacker and defender agents. We further accelerate the convergence of the RL agents by guiding them with expert knowledge from Cybersecurity Knowledge Graphs on attack and mitigation steps. We have implemented and integrated our proposed approaches into the CyberBattleSim system.
more » « less
Full Text Available
Combating Fake Cyber Threat Intelligence using Provenance in Cybersecurity Knowledge Graphs

https://doi.org/10.1109/BigData52589.2021.9671867

Mitra, Shaswata; Piplai, Aritran; Mittal, Sudip; Joshi, Anupam (December 2021, 2021 IEEE International Conference on Big Data (Big Data))

Today there is a significant amount of fake cybersecurity related intelligence on the internet. To filter out such information, we build a system to capture the provenance information and represent it along with the captured Cyber Threat Intelligence (CTI). In the cybersecurity domain, such CTI is stored in Cybersecurity Knowledge Graphs (CKG). We enhance the exiting CKG model to incorporate intelligence provenance and fuse provenance graphs with CKG. This process includes modifying traditional approaches to entity and relation extraction. CTI data is considered vital in securing our cyberspace. Knowledge graphs containing CTI information along with its provenance can provide expertise to dependent Artificial Intelligence (AI) systems and human analysts.
more » « less
Full Text Available
CyBERT: Contextualized Embeddings for the Cybersecurity Domain

https://doi.org/10.1109/BigData52589.2021.9671824

Ranade, Priyanka; Piplai, Aritran; Joshi, Anupam; Finin, Tim (December 2021, IEEE International Conference on Big Data)

We present CyBERT, a domain-specific Bidirectional Encoder Representations from Transformers (BERT) model, fine-tuned with a large corpus of textual cybersecurity data. State-of-the-art natural language models that can process dense, fine-grained textual threat, attack, and vulnerability information can provide numerous benefits to the cybersecurity community. The primary contribution of this paper is providing the security community with an initial fine-tuned BERT model that can perform a variety of cybersecurity-specific downstream tasks with high accuracy and efficient use of resources. We create a cybersecurity corpus from open-source unstructured and semi-unstructured Cyber Threat Intelligence (CTI) data and use it to fine-tune a base BERT model with Masked Language Modeling (MLM) to recognize specialized cybersecurity entities. We evaluate the model using various downstream tasks that can benefit modern Security Operations Centers (SOCs). The finetuned CyBERT model outperforms the base BERT model in the domain-specific MLM evaluation. We also provide use-cases of CyBERT application in cybersecurity based downstream tasks.
more » « less
Full Text Available
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

https://doi.org/10.1109/IJCNN52387.2021.9534192

Ranade, Priyanka; Piplai, Aritran; Mittal, Sudip; Joshi, Anupam; Finin, Tim (July 2021, 2021 International Joint Conference on Neural Networks (IJCNN))

Full Text Available
Using Knowledge Graphs and Reinforcement Learning for Malware Analysis

https://doi.org/10.1109/BigData50022.2020.9378491

Piplai, Aritran; Ranade, Priyanka; Kotal, Anantaa; Mittal, Sudip; Narayanan, Sandeep Nair; Joshi, Anupam (December 2020, 2020 IEEE International Conference on Big Data (Big Data))
null (Ed.)
Machine learning algorithms used to detect attacks are limited by the fact that they cannot incorporate the back-ground knowledge that an analyst has. This limits their suitability in detecting new attacks. Reinforcement learning is different from traditional machine learning algorithms used in the cybersecurity domain. Compared to traditional ML algorithms, reinforcement learning does not need a mapping of the input-output space or a specific user-defined metric to compare data points. This is important for the cybersecurity domain, especially for malware detection and mitigation, as not all problems have a single, known, correct answer. Often, security researchers have to resort to guided trial and error to understand the presence of a malware and mitigate it.In this paper, we incorporate prior knowledge, represented as Cybersecurity Knowledge Graphs (CKGs), to guide the exploration of an RL algorithm to detect malware. CKGs capture semantic relationships between cyber-entities, including that mined from open source. Instead of trying out random guesses and observing the change in the environment, we aim to take the help of verified knowledge about cyber-attack to guide our reinforcement learning algorithm to effectively identify ways to detect the presence of malicious filenames so that they can be deleted to mitigate a cyber-attack. We show that such a guided system outperforms a base RL system in detecting malware.
more » « less
Full Text Available

« Prev Next »

Search for: All records