NSF PAR Search | NSF Public Access Repository

Minitrack Introduction: Cybersecurity in the Age of Artificial Intelligence, AI for Cybersecurity, and Cybersecurity for AI

Patton, M; Chen, H; Samtani, S; Zhu, H (January 2024, Proceedings of the 57th Hawaii International Conference on System Sciences)

Cybersecurity and Artificial Intelligence (AI) are key domains whose intersection holds great promise and poses significant threats. Indeed, the National Academy of Sciences (NAS), the National Science Foundation (NSF), and other respected entities have noted the significant role that AI can play in cybersecurity, and the importance of ensuring the security of AI-enabled algorithms and systems. This minitrack focuses on AI and Cybersecurity that works in broader domains, collaborative inter-organizational realms, shared collaborative domains, or with collaborative technologies. The papers in this minitrack have the potential to offer interesting and impactful solutions to emerging areas, including unmanned aerial vehicles and open source software security.
Full Text Available
Suggesting Alternatives for Potentially Insecure Artificial Intelligence Repositories: An Unsupervised Graph Embedding Approach

Lazarine, B; Samtani, S; Zhu, H; Venkataraman, R (January 2024, Hawaii International Conference on System Sciences (HICSS), 2024)

Emerging Artificial Intelligence (AI) applications bring the potential for both significant societal benefit and harm. Additionally, vulnerabilities within AI source code can make these applications susceptible to attacks ranging from stealing private data to stealing trained model parameters. Recently, with the adoption of open-source software (OSS) practices, the AI development community has introduced the potential to increase the number of vulnerabilities present in emerging AI applications: new applications built on top of previous applications naturally inherit any vulnerabilities. With the AI OSS community growing rapidly to a scale that requires automated means of analysis for vulnerability management, we compare three categories of unsupervised graph embedding methods capable of generating repository embeddings that can be used to rank existing applications based on their functional similarity for AI developers. The resulting embeddings can be used to suggest alternatives to AI developers for potentially insecure AI repositories.
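The ranking step mentioned in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure, which the abstract does not specify, and uses made-up three-dimensional embeddings with hypothetical repository names; it is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_alternatives(query, embeddings, top_k=3):
    """Return up to top_k (name, score) pairs most similar to `query`,
    excluding the query repository itself."""
    q = embeddings[query]
    scored = [
        (name, cosine(q, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical repository embeddings (illustrative values only).
embeddings = {
    "repo_a": [1.0, 0.0, 0.0],
    "repo_b": [0.9, 0.1, 0.0],
    "repo_c": [0.0, 1.0, 0.0],
    "repo_d": [0.5, 0.5, 0.0],
}

# Suggest the two repositories closest to a (hypothetically insecure) repo_a.
alternatives = rank_alternatives("repo_a", embeddings, top_k=2)
```

In practice the embeddings would come from whichever graph embedding method is being compared; only the ranking logic is shown here.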
Full Text Available
Linking Exploits from the Dark Web to Known Vulnerabilities for Proactive Cyber Threat Intelligence: An Attention-based Deep Structured Semantic Model

https://doi.org/10.25300/MISQ/2022/15392

Samtani, S; Chai, Y; and Chen, H (June 2022, MIS Quarterly)

Black hat hackers use malicious exploits to circumvent security controls and take advantage of system vulnerabilities worldwide, costing the global economy over $450 billion annually. While many organizations are increasingly turning to cyber threat intelligence (CTI) to help prioritize their vulnerabilities, extant CTI processes are often criticized as being reactive to known exploits. One promising data source that can help develop proactive CTI is the vast and ever-evolving Dark Web. In this study, we adopted the computational design science paradigm to design a novel deep learning (DL)-based exploit-vulnerability attention deep structured semantic model (EVA-DSSM) that includes bidirectional processing and attention mechanisms to automatically link exploits from the Dark Web to vulnerabilities. We also devised a novel device vulnerability severity metric (DVSM) that incorporates the exploit post date and vulnerability severity to help cybersecurity professionals with their device prioritization and risk management efforts. We rigorously evaluated the EVA-DSSM against state-of-the-art non-DL and DL-based methods for short text matching on 52,590 exploit-vulnerability linkages across four testbeds: web application, remote, local, and denial of service. Results of these evaluations indicate that the proposed EVA-DSSM achieves precision at 1 scores 20%-41% higher than non-DL approaches and 4%-10% higher than DL-based approaches. We demonstrated the EVA-DSSM's and DVSM's practical utility with two CTI case studies: openly accessible systems in the top eight U.S. hospitals and over 20,000 Supervisory Control and Data Acquisition (SCADA) systems worldwide. A complementary user evaluation of the case study results indicated that 45 cybersecurity professionals found the EVA-DSSM and DVSM results more useful for exploit-vulnerability linking and risk prioritization activities than those produced by prevailing approaches. Given the rising cost of cyberattacks, the EVA-DSSM and DVSM have important implications for analysts in security operations centers, incident response teams, and cybersecurity vendors.
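For readers unfamiliar with the "precision at 1" metric reported in this abstract, the sketch below shows the standard definition: the fraction of queries whose top-ranked candidate is a correct match. The exploit and vulnerability identifiers are hypothetical placeholders, and the rankings are made up; this is not the paper's evaluation code or data.

```python
def precision_at_1(ranked_candidates, relevant):
    """Fraction of queries whose top-ranked candidate is a correct match.

    ranked_candidates: dict mapping a query to its ranked candidate list.
    relevant: dict mapping a query to the set of correct candidates.
    """
    if not ranked_candidates:
        return 0.0
    hits = 0
    for query, ranking in ranked_candidates.items():
        if ranking and ranking[0] in relevant.get(query, set()):
            hits += 1
    return hits / len(ranked_candidates)


# Hypothetical exploits, each with vulnerabilities ranked by some model.
ranked = {
    "exploit_1": ["vuln_a", "vuln_b"],
    "exploit_2": ["vuln_c", "vuln_d"],
    "exploit_3": ["vuln_e", "vuln_f"],
    "exploit_4": ["vuln_h", "vuln_g"],
}
# Hypothetical ground-truth linkages.
truth = {
    "exploit_1": {"vuln_a"},
    "exploit_2": {"vuln_d"},
    "exploit_3": {"vuln_e"},
    "exploit_4": {"vuln_h"},
}

# Three of the four top-ranked candidates are correct.
score = precision_at_1(ranked, truth)
```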
Full Text Available
Cross-Lingual Cybersecurity Analytics in the International Dark Web with Adversarial Deep Representation Learning

https://doi.org/10.25300/MISQ/2022/16618

Ebrahimi, M; Chai, Y; Samtani, S; and Chen, H. (June 2022, MIS Quarterly)

International dark web platforms operating within multiple geopolitical regions and languages host a myriad of hacker assets such as malware, hacking tools, hacking tutorials, and malicious source code. Cybersecurity analytics organizations employ machine learning models trained on human-labeled data to automatically detect these assets and bolster their situational awareness. However, the lack of human-labeled training data is prohibitive when analyzing foreign-language dark web content. In this research note, we adopt the computational design science paradigm to develop a novel IT artifact for cross-lingual hacker asset detection (CLHAD). CLHAD automatically leverages the knowledge learned from English content to detect hacker assets in non-English dark web platforms. CLHAD encompasses a novel adversarial deep representation learning (ADREL) method, which generates multilingual text representations using generative adversarial networks (GANs). Drawing upon the state of the art in cross-lingual knowledge transfer, ADREL is a novel approach to automatically extract transferable text representations and facilitate the analysis of multilingual content. We evaluate CLHAD on Russian, French, and Italian dark web platforms, demonstrate its practical utility in hacker asset profiling, and conduct a proof-of-concept case study. Our analysis suggests that cybersecurity managers may benefit more from focusing on Russian to identify sophisticated hacking assets. In contrast, financial hacker assets are scattered among several dominant dark web languages. Managerial insights for security managers are discussed at operational and strategic levels.
Full Text Available
Identifying and Categorizing Malicious Content on Paste Sites: A Neural Topic Modeling Approach

https://doi.org/10.1109/ISI53945.2021.9624765

Vahedi, T; Ampel, B; Samtani, S and (November 2021, 2021 IEEE International Conference on Intelligence and Security Informatics)

Malicious cyber activities impose substantial costs on the U.S. economy and global markets. Cyber-criminals often use information-sharing social media platforms such as paste sites (e.g., Pastebin) to share vast amounts of plain text content related to Personally Identifiable Information (PII), credit card numbers, exploit code, malware, and other sensitive content. Paste sites can provide targeted Cyber Threat Intelligence (CTI) about potential threats and prior breaches. In this research, we propose a novel Bidirectional Encoder Representations from Transformers (BERT) with Latent Dirichlet Allocation (LDA) model to categorize pastes automatically. Our proposed BERT-LDA model leverages a neural network transformer architecture to capture sequential dependencies when representing each sentence in a paste. BERT-LDA replaces the Bag-of-Words (BoW) approach in the conventional LDA with a Bag-of-Labels (BoL) that encompasses class labels at the sequence level. We compared the performance of the proposed BERT-LDA against the conventional LDA and BERT-LDA variants (e.g., GPT2-LDA) on 4,254,453 pastes from three paste sites. Experiment results indicate that the proposed BERT-LDA outperformed the standard LDA and each BERT-LDA variant in terms of perplexity on each paste site. Results of our BERT-LDA case study suggest that significant content relating to hacker community activities, malicious code, network and website vulnerabilities, and PII is shared on paste sites. The insights provided by this study could be used by organizations to proactively mitigate potential damage to their infrastructure.
Full Text Available
ACM KDD AI4Cyber: The 1st Workshop on Artificial Intelligence-enabled Cybersecurity Analytics

Samtani, S; Yang, S; and Chen, H. (August 2021, ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021)

Despite significant contributions to various aspects of cybersecurity, cyber-attacks unfortunately remain on the rise. Increasingly, internationally recognized entities such as the National Science Foundation and National Science & Technology Council have noted that Artificial Intelligence can help analyze billions of log files, Dark Web data, malware, and other data sources to help execute fundamental cybersecurity tasks. Our objective for the 1st Workshop on Artificial Intelligence-enabled Cybersecurity Analytics (half-day; co-located with ACM KDD) was to gather academics and practitioners to contribute recent work pertaining to AI-enabled cybersecurity analytics. We composed an outstanding, inter-disciplinary Program Committee with significant expertise in various aspects of AI-enabled Cybersecurity Analytics to evaluate the submitted work. Significant contributions to the half-day workshop were made in the areas of CTI, vulnerability assessment, and malware analysis.
Full Text Available
ACM KDD AI4Cyber: The 1st Workshop on Artificial Intelligence-enabled Cybersecurity Analytics

https://doi.org/10.1145/3447548.3469450

Samtani, S.; Yang, S.; Chen, H. (June 2021, SIGKDD Explorations)

Full Text Available
Exploring the Evolution of Exploit-Sharing Hackers: An Unsupervised Graph Embedding Approach

https://doi.org/10.1109/ISI53945.2021.9624846

Otto, K; Ampel, A; Zhu, H; Samtani, S; and Chen, H. (November 2021, 2021 IEEE Intelligence and Security Informatics (ISI))

Cybercrime was estimated to cost the global economy $945 billion in 2020. Increasingly, law enforcement agencies are using social network analysis (SNA) to identify key hackers from Dark Web hacker forums for targeted investigations. However, past approaches have primarily focused on analyzing key hackers at a single point in time and have used only a hacker's structural features. In this study, we propose a novel Hacker Evolution Identification Framework to identify how hackers evolve within hacker forums. The proposed framework has two novelties in its design. First, the framework captures features such as user statistics, node-level metrics, lexical measures, and post style when representing each hacker with unsupervised graph embedding methods. Second, the framework incorporates mechanisms to align embedding spaces across multiple time-spells of data to facilitate analysis of how hackers evolve over time. Two experiments were conducted to assess the performance of prevailing graph embedding algorithms and nodal feature variations in the task of graph reconstruction in five time-spells. Results of our experiments indicate that Text-Associated DeepWalk (TADW) with all of the proposed nodal features outperforms methods without nodal features in terms of Mean Average Precision in each time-spell. We illustrate the potential practical utility of the proposed framework with a case study on an English forum with 51,612 posts. The results produced by the framework in this case study identified key hackers posting piracy assets.
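Mean Average Precision for graph reconstruction, the metric named in this abstract, is commonly computed by ranking candidate neighbors for each node (for example, by embedding similarity) and averaging the per-node average precision against the true edges. The sketch below assumes that common formulation; the node names, graph, and rankings are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def average_precision(ranking, true_neighbors):
    """Mean of precision@k over the ranks k at which a true neighbor appears,
    normalized by the number of true neighbors."""
    if not true_neighbors:
        return 0.0
    hits = 0
    total = 0.0
    for k, node in enumerate(ranking, start=1):
        if node in true_neighbors:
            hits += 1
            total += hits / k
    return total / len(true_neighbors)


def mean_average_precision(rankings, adjacency):
    """Average the per-node average precision over nodes with at least one edge."""
    nodes = [n for n in rankings if adjacency.get(n)]
    if not nodes:
        return 0.0
    return sum(average_precision(rankings[n], adjacency[n]) for n in nodes) / len(nodes)


# Hypothetical forum graph: h1 is connected to h2 and h3.
adjacency = {"h1": {"h2", "h3"}, "h2": {"h1"}, "h3": {"h1"}}

# Hypothetical neighbor rankings, as if derived from embedding similarity.
rankings = {
    "h1": ["h2", "h3"],  # both true neighbors ranked first: AP = 1.0
    "h2": ["h3", "h1"],  # true neighbor at rank 2: AP = 0.5
    "h3": ["h1", "h2"],  # true neighbor at rank 1: AP = 1.0
}

map_score = mean_average_precision(rankings, adjacency)
```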
Full Text Available
A Deep Learning Approach for Recognizing Activity of Daily Living (ADL) for Senior Care: Exploiting Interaction Dependency and Temporal Patterns

Zhu, H.; Samtani, S.; Brown, R.; Chen, H. (June 2021, MIS Quarterly)
Full Text Available
