
Search for: All records

Award ID contains: 2115040


  1. Website privacy policies are often lengthy and intricate. Privacy assistants help simplify policies and make them more accessible and user-friendly. The emergence of generative AI (genAI) offers new opportunities to build privacy assistants that can answer users’ questions about privacy policies. However, genAI’s reliability is a concern due to its potential for producing inaccurate information. This study introduces GenAIPABench, a benchmark for evaluating Generative AI-based Privacy Assistants (GenAIPAs). GenAIPABench includes: 1) a set of curated questions about privacy policies, along with annotated answers, for various organizations and regulations; 2) metrics to assess the accuracy, relevance, and consistency of responses; and 3) a tool for generating prompts that introduce privacy policies and paraphrased variants of the curated questions. We used GenAIPABench to evaluate three leading genAI systems (ChatGPT-4, Bard, and Bing AI) and gauge their effectiveness as GenAIPAs. Our results show significant promise for genAI in the privacy domain while also highlighting challenges in handling complex queries, ensuring consistency, and verifying source accuracy.
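    As a rough illustration of the kind of evaluation loop GenAIPABench describes (curated questions with annotated answers, paraphrased variants, and accuracy/consistency scoring), here is a minimal Python sketch; the class, function, and metric names are illustrative assumptions, not the benchmark's actual API.

        from dataclasses import dataclass, field

        @dataclass
        class BenchmarkItem:
            question: str                  # curated privacy-policy question
            reference_answer: str          # annotated ground-truth answer
            paraphrases: list = field(default_factory=list)  # reworded variants

        def evaluate(assistant, policy_text, items, score_fn):
            """Score a genAI privacy assistant on curated questions and paraphrases."""
            results = []
            for item in items:
                answers = []
                for q in [item.question] + item.paraphrases:
                    # Introduce the policy, then ask the (possibly paraphrased) question.
                    answers.append(assistant(f"Privacy policy:\n{policy_text}\n\nQuestion: {q}"))
                accuracy = score_fn(answers[0], item.reference_answer)
                # Consistency: paraphrased questions should get equivalent answers.
                others = answers[1:]
                consistency = (sum(score_fn(a, answers[0]) for a in others) / len(others)) if others else 1.0
                results.append({"question": item.question, "accuracy": accuracy, "consistency": consistency})
            return results

    Any answer-similarity callable can stand in for score_fn; the key point is that consistency is measured across paraphrases of the same question rather than against the reference alone.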
  2. High-quality knowledge graphs (KGs) play a crucial role in many applications. However, KGs created by automated information extraction systems can suffer from erroneous extractions or be inconsistent with their provenance/source text. It is important to identify and correct such problems. In this paper, we study leveraging the emergent reasoning capabilities of large language models (LLMs) to detect inconsistencies between extracted facts and their provenance. Focusing on "open" LLMs that can be run and trained locally, we find that few-shot approaches can yield an absolute performance gain of 2.5-3.4% over the state-of-the-art method with only 9% of the training data. We examine the effect of LLM architecture and show that decoder-only models underperform encoder-decoder approaches. We also explore how model size impacts performance and, counterintuitively, find that larger models do not yield consistent performance gains. Our detailed analyses suggest that while LLMs can improve KG consistency, different LLM models learn different aspects of KG consistency and are sensitive to the number of entities involved.
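    A minimal sketch of what a few-shot fact-vs-provenance consistency check like the one studied here might look like, assuming the locally run LLM is wrapped as a plain prompt-to-text callable; the prompt template and demonstration examples below are illustrative, not the paper's actual ones.

        # Two worked demonstrations steer the model; real few-shot sets would be larger.
        FEW_SHOT = [
            ("(Marie Curie, born_in, Warsaw)", "Marie Curie was born in Warsaw in 1867.", "CONSISTENT"),
            ("(Marie Curie, born_in, Paris)", "Marie Curie was born in Warsaw in 1867.", "INCONSISTENT"),
        ]

        def build_prompt(triple, provenance):
            """Assemble a few-shot prompt asking whether a fact matches its source text."""
            blocks = ["Decide whether the extracted fact is supported by the source text."]
            for fact, source, label in FEW_SHOT:
                blocks.append(f"Fact: {fact}\nSource: {source}\nAnswer: {label}")
            blocks.append(f"Fact: {triple}\nSource: {provenance}\nAnswer:")
            return "\n\n".join(blocks)

        def is_consistent(llm, triple, provenance):
            # llm is any prompt -> text callable backed by a locally run model
            return llm(build_prompt(triple, provenance)).strip().upper().startswith("CONSISTENT")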
  3. Big Data empowers the farming community with the information needed to optimize resource usage, increase productivity, and enhance the sustainability of agricultural practices. The use of Big Data in farming requires the collection and analysis of data from various sources, such as sensors, satellites, and farmer surveys. While Big Data can provide the farming community with valuable insights and improve efficiency, there is significant concern regarding the security of this data as well as the privacy of the participants. Privacy regulations, such as the European Union’s General Data Protection Regulation (GDPR), the EU Code of Conduct on agricultural data sharing by contractual agreement, and the proposed EU AI law, have been created to address the issue of data privacy and provide specific guidelines on when and how data can be shared between organizations. To make confidential agricultural data widely available for Big Data analysis without violating the privacy of the data subjects, we consider privacy-preserving methods of data sharing in agriculture. Synthetic data that retains the statistical properties of the original data but does not include actual individuals’ information provides a suitable alternative to sharing sensitive datasets. Deep-learning-based synthetic data generation has been proposed for privacy-preserving data sharing. However, such privacy-preserving efforts rarely comply with documented data privacy policies. In this study, we propose a novel framework for enforcing privacy policy rules in privacy-preserving data generation algorithms. We explore several available agricultural codes of conduct, extract knowledge related to the privacy constraints in data, and use the extracted knowledge to define privacy bounds in a privacy-preserving generative model. We use our framework to generate synthetic agricultural data and present experimental results that demonstrate the utility of the synthetic dataset in downstream tasks. We also show that our framework can mitigate potential threats, such as re-identification and linkage attacks, and secure data according to applicable regulatory policy rules.
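    A minimal sketch, under stated assumptions, of how policy rules extracted from a code of conduct could bound a generative model's output; the rule vocabulary (suppress/generalize/clip), field names, and numeric bounds are hypothetical placeholders, not the paper's actual framework.

        # Hypothetical rules extracted from an agricultural code of conduct.
        POLICY_RULES = {
            "farm_id": "suppress",     # direct identifier: never released
            "location": "generalize",  # coarsen "Region, District" to region level
            "yield_kg": "clip",        # bound to a plausible range
        }
        YIELD_BOUNDS = (0, 20000)

        def enforce(record, rules=POLICY_RULES):
            """Apply extracted privacy constraints to one synthetic record."""
            out = dict(record)
            for column, action in rules.items():
                if action == "suppress":
                    out.pop(column, None)
                elif action == "generalize" and column in out:
                    out[column] = out[column].split(",")[0]  # keep only the region
                elif action == "clip" and column in out:
                    lo, hi = YIELD_BOUNDS
                    out[column] = min(max(out[column], lo), hi)
            return out

        def sample_synthetic(generator, n, rules=POLICY_RULES):
            # generator() stands in for a trained deep generative model's sampler
            return [enforce(generator(), rules) for _ in range(n)]

    Enforcing the rules at sampling time, as sketched here, is one option; the paper's framework defines the privacy bounds inside the generative model itself.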
  4. A key challenge faced by small and medium-sized businesses is securely managing software updates and changes. With rapidly evolving cybersecurity threats, changes, updates, and patches to software systems are necessary to stay ahead of emerging threats and are often mandated by regulators or statutory authorities. However, security patches and updates require stress testing before they can be released into the production system. Stress testing in production environments is risky and poses security threats. Large businesses usually have a non-production environment where such changes can be made and tested before being released into production; smaller businesses do not have such facilities. In this work, we show how “digital twins”, especially for a mix of IT and IoT environments, can be created on the cloud. These digital twins act as a non-production environment where changes can be applied and the system can be securely tested before a patch is released. Additionally, the non-production digital twin can be used to collect system data and run stress tests on the environment, both manually and automatically. In this paper, we show how, using a small sample of real data and interactions, generative Artificial Intelligence (AI) models can produce testing scenarios that check for points of failure.
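    The scenario-generation step could look roughly like the following sketch, where the LLM call and the twin's interface are generic placeholders rather than any specific product API.

        def generate_scenarios(llm, sample_interactions, n=20):
            """Ask a generative model to extrapolate stress-test cases from real traces."""
            prompt = (
                "Here are sample IT/IoT interaction logs:\n"
                + "\n".join(sample_interactions)
                + f"\n\nGenerate {n} varied test scenarios, one per line, "
                  "including edge cases likely to expose failure points."
            )
            return [line for line in llm(prompt).splitlines() if line.strip()]

        def stress_test(twin, scenarios):
            """Replay generated scenarios against the non-production digital twin."""
            failures = []
            for scenario in scenarios:
                try:
                    twin.run(scenario)        # exercise the patched system copy
                except Exception as err:      # faults surface in the twin, not production
                    failures.append((scenario, err))
            return failures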
  5. Neurosymbolic artificial intelligence (AI) is an emerging and quickly advancing field that combines the subsymbolic strengths of (deep) neural networks with the explicit, symbolic knowledge contained in knowledge graphs (KGs) to enhance explainability and safety in AI systems. This approach addresses a key criticism of current-generation systems: their inability to generate human-understandable explanations for their outcomes and to ensure safe behavior, especially in scenarios with unknown unknowns (e.g., cybersecurity, privacy). Integrating neural networks, which excel at exploring complex data spaces, with symbolic KGs representing domain knowledge allows AI systems to reason, learn, and generalize in a manner understandable to experts. This article describes how applications in cybersecurity and privacy, two domains that demand both high accuracy and explainability in complex environments, can benefit from neurosymbolic AI.
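    One common neurosymbolic pattern of the kind the article surveys, in which a symbolic KG vetoes neural predictions that violate domain knowledge, can be sketched as follows; the toy cybersecurity KG and scoring interface are purely illustrative.

        # A toy KG of admissible (subject, relation, object) triples.
        KG_ALLOWED = {
            ("malware", "exfiltrates", "credentials"),
            ("malware", "targets", "endpoint"),
        }

        def neurosymbolic_predict(neural_scores, subject, obj, threshold=0.5):
            """Keep only the neural predictions that the knowledge graph licenses."""
            admissible = [
                (relation, score)
                for relation, score in neural_scores.items()
                if score >= threshold and (subject, relation, obj) in KG_ALLOWED
            ]
            return sorted(admissible, key=lambda rs: rs[1], reverse=True)

        # e.g. neurosymbolic_predict({"exfiltrates": 0.8, "encrypts": 0.7}, "malware", "credentials")
        # -> [("exfiltrates", 0.8)]; "encrypts" is vetoed by the KG despite its high score.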