This content will become publicly available on November 27, 2025

Title: Generating Effective Answers to People’s Everyday Cybersecurity Questions: An Initial Study
Human users are often the weakest link in cybersecurity, with a large percentage of security breaches attributed to some kind of human error. When confronted with everyday cybersecurity questions (or any other question, for that matter), users tend to turn to search engines, online forums, and, more recently, chatbots. We report on a study of the effectiveness of answers generated by two popular chatbots to an initial set of questions about typical cybersecurity challenges users face (e.g., phishing, VPN use, multi-factor authentication). The study looks not only at the accuracy of the chatbots' answers but also at whether those answers are understandable, whether they are likely to motivate users to follow the recommendations they contain, and whether those recommendations are actionable. Surprisingly, this initial study suggests that state-of-the-art chatbots are already reasonably good at providing accurate answers to common cybersecurity questions. Yet it also suggests that the chatbots are far less effective at generating answers that are relevant, actionable, and, most importantly, likely to motivate users to heed their recommendations. The study proceeds with the design and evaluation of prompt engineering techniques intended to improve the effectiveness of the chatbots' answers. Initial results suggest that it is possible to improve the effectiveness of answers, in particular their likelihood of motivating users to heed recommendations and users' ability to act on those recommendations, without diminishing accuracy. We discuss the implications of these initial results and plans for future work in this area.
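The paper's actual prompts are not reproduced here. As a minimal sketch of what prompt engineering for such an assistant can look like, the following assumes an OpenAI-style chat-completions client; the system prompt, model name, and helper function are illustrative placeholders, written to target the four qualities the study measures (accuracy, understandability, actionability, motivational power), not the authors' method.

```python
# Illustrative sketch only -- the paper does not publish its exact prompt.
# Assumes the OpenAI Python client (pip install openai); any chat API with
# system/user messages would work the same way. SYSTEM_PROMPT, the model
# name, and answer_security_question() are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a cybersecurity assistant for non-expert users. "
    "Answer accurately and in plain language. "
    "Briefly explain why the risk matters to this user personally, "
    "then give short, numbered steps they can take right away."
)

def answer_security_question(question: str) -> str:
    """Return an answer shaped by the engineered system prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer_security_question("Should I use a VPN on public Wi-Fi?"))
```

The design choice here is simply to encode the desired answer qualities as standing instructions, so every user question is answered under the same constraints.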
Award ID(s):
1914486
PAR ID:
10595733
Author(s) / Creator(s):
Publisher / Repository:
Springer Nature Singapore
Date Published:
ISBN:
978-981-96-0576-7
Page Range / eLocation ID:
363 to 379
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
1. Human actions, or lack thereof, contribute to a large majority of cybersecurity incidents. Traditionally, when looking for advice on cybersecurity questions, people have turned to search engines or social sites like Reddit. The rapid adoption of chatbot technologies offers a potentially more direct way of getting similar advice. Initial research suggests, however, that while chatbot answers to common cybersecurity questions tend to be fairly accurate, they may not be very effective, as they often fall short on other desired qualities such as understandability, actionability, or motivational power. Research in this area has thus far been limited to evaluations by researchers themselves of small numbers of synthetic questions. This article reports on what we believe to be the first in situ evaluation of a cybersecurity Question Answering (QA) assistant. We also evaluate a prompt engineered to help the cybersecurity QA assistant generate more effective answers. The study involved a 10-day deployment of a cybersecurity QA assistant in the form of a Chrome extension. Collectively, participants (N=51) evaluated answers generated by the assistant to over 1,000 cybersecurity questions they submitted as part of their regular day-to-day activities. The results suggest that a majority of participants found the assistant useful and often took actions based on the answers they received. In particular, the study indicates that prompting successfully improved the effectiveness of answers, most notably the likelihood that users follow their recommendations (the fraction of participants who actually followed the advice was 0.514 with prompting vs. 0.402 without, p = 4.61e-04; a sketch of this kind of two-proportion comparison appears after this list), an impact on people's actual behavior. We provide a detailed analysis of the data collected in this study, discuss its implications, and outline next steps in the development and deployment of effective cybersecurity QA assistants, which offer the promise of changing actual user behavior and reducing human-related security incidents.
2. Generative AI, particularly Large Language Models (LLMs), has revolutionized human-computer interaction by enabling the generation of nuanced, human-like text. This presents new opportunities, especially in enhancing explainability for AI systems like recommender systems, a crucial factor in fostering user trust and engagement. LLM-powered AI chatbots can be leveraged to provide personalized explanations for recommendations. Although users often find these chatbot explanations helpful, they may not fully comprehend the content. Our research focuses on assessing how well users comprehend these explanations and on identifying gaps in understanding. We also explore the key behavioral differences between users who effectively understand AI-generated explanations and those who do not. We designed a three-phase user study with 17 participants to explore these dynamics. The findings indicate that the clarity and usefulness of the explanations are contingent on users asking relevant follow-up questions and being motivated to learn. Comprehension also varies significantly with users' educational backgrounds.
3. Despite their ability to answer complex questions, it is unclear whether generative chatbots should be considered experts in any domain. Several important cognitive and metacognitive differences separate human experts from generative chatbots. First, human experts' domain knowledge is deep, efficiently structured, adaptive, and intuitive, whereas generative chatbots' knowledge is shallow and inflexible, leading to errors that human experts would rarely make. Second, generative chatbots lack the metacognitive capacities that allow humans to detect errors in their own thinking and communicate this information to others. Though generative chatbots may surpass human experts in the future, for now the nature of their knowledge structures and metacognition prevents them from reaching true expertise.
4. As more users adopt VPNs for a variety of reasons, it is important to develop empirical knowledge of their needs and of their mental models of what a VPN offers. Moreover, studying VPN users alone is not enough because, by using a VPN, a user essentially transfers trust, say from their network provider, onto the VPN provider. To that end, we are the first to study the VPN ecosystem from both the users' and the providers' perspectives. In this paper, we conduct a quantitative survey of 1,252 VPN users in the U.S. and qualitative interviews with nine providers to answer several research questions regarding users' motivations, needs, threat models, and mental models, as well as the key challenges and insights reported by VPN providers. We derive novel insights by combining our multi-perspective results and highlight cases where the user and provider perspectives are misaligned. Alarmingly, we find that users rely on and trust VPN review sites, while VPN providers shed light on how these sites are mostly motivated by money. Worryingly, we find that users have flawed mental models of the protection VPNs provide and of the data VPNs collect. We present actionable recommendations for technologists and security and privacy advocates by identifying potential areas on which to focus efforts and improve the VPN ecosystem.
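As a hedged illustration of the two-proportion comparison reported in item 1 above: the per-condition counts below are hypothetical, chosen only to match the reported fractions (0.514 vs. 0.402 over roughly 1,000 evaluated answers); the abstract does not state the exact denominators, so this sketch shows the kind of test involved, not the authors' actual analysis.

```python
# Hedged illustration: the counts below are hypothetical, chosen only to
# match the reported fractions (0.514 vs. 0.402); the abstract does not
# state the exact per-condition denominators, so this reproduces the kind
# of test involved, not the authors' actual analysis.
from statsmodels.stats.proportion import proportions_ztest

followed = [257, 201]  # advice followed: with prompting, without (hypothetical)
totals = [500, 500]    # answers evaluated per condition (hypothetical)

z_stat, p_value = proportions_ztest(followed, totals)
print(f"z = {z_stat:.3f}, p = {p_value:.2e}")
# With these assumed counts, p comes out near 3.9e-04, the same order of
# magnitude as the reported 4.61e-04.
```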