Human actions or lack thereof contribute to a large majority of cybersecurity incidents. Traditionally, when looking for advice on cybersecurity questions, people have turned to search engines or social sites like Reddit. The rapid adoption of chatbot technologies is offering a potentially more direct way of getting similar advice. Initial research suggests, however, that while chatbot answers to common cybersecurity questions tend to be fairly accurate, they may not be very effective as they often fall short on other desired qualities such as understandability, actionability, or motivational power. Research in this area thus far has been limited to the evaluation by researchers themselves on a small number of synthetic questions. This article reports on what we believe to be the first in situ evaluation of a cybersecurity Question Answering (QA) assistant. We also evaluate a prompt engineered to help the cybersecurity QA assistant generate more effective answers. The study involved a 10-day deployment of a cybersecurity QA assistant in the form of a Chrome extension. Collectively, participants (N=51) evaluated answers generated by the assistant to over 1,000 cybersecurity questions they submitted as part of their regular day-to-day activities. The results suggest that a majority of participants found the assistant useful and often took actions based on the answers they received. In particular, the study indicates that prompting successfully improved the effectiveness of answers and, in particular, the likelihood that users follow their recommendations (fraction of participants who actually followed the advice was 0.514 with prompting vs. 0.402 without prompting, p=4.61E-04), an impact on people’s actual behavior. We provide a detailed analysis of data collected in this study, discuss their implications, and outline next steps in the development and deployment of effective cybersecurity QA assistants that offer the promise of changing actual user behavior and of reducing human-related security incidents.
more »
« less
Generating Effective Answers to People’s Everyday Cybersecurity Questions: An Initial Study
Human users are often the weakest link in cybersecurity, with a large percentage of security breaches attributed to some kind of human error. When confronted with everyday cybersecurity questions - or any other question for that matter, users tend to turn to their search engines, online forums, and, recently, chatbots. We report on a study on the effectiveness of answers generated by two popular chatbots to an initial set of questions related to typical cybersecurity challenges faced by users (e.g., phishing, use of VPN, multi-factor authentication). The study does not only look at the accuracy of the answers generated by the chatbots but also at whether these answers are understandable, whether they are likely to motivate users to follow any provided recommendations, and whether these recommendations are actionable. Surprisingly enough, this initial study suggests that state-of-the-art chatbots are already reasonably good at providing accurate answers to common cybersecurity questions. Yet the study also suggests that the chatbots are not very effective when it comes to generating answers that are relevant, actionable, and, most importantly, likely to motivate users to heed their recommendations. The study proceeds with the design and evaluation of prompt engineering techniques intended to improve the effectiveness of answers generated by the chatbots. Initial results suggest that it is possible to improve the effectiveness of answers and, in particular, their likelihood of motivating users to heed recommendations, and their ability to act upon these recommendations without diminishing their accuracy. We discuss the implications of these initial results and plans for future work in this area.
more »
« less
- Award ID(s):
- 1914486
- PAR ID:
- 10595733
- Publisher / Repository:
- Springer Nature Singapore
- Date Published:
- ISBN:
- 978-981-96-0576-7
- Page Range / eLocation ID:
- 363 to 379
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Human actions or lack thereof contribute to a large majority of cybersecurity incidents. Traditionally, when looking for advice on cybersecurity questions, people have turned to search engines or social sites like Reddit. The rapid adoption of chatbot technologies is offering a potentially more direct way of getting similar advice. Initial research suggests, however, that while chatbot answers to common cybersecurity questions tend to be fairly accurate, they may not be very effective as they often fall short on other desired qualities such as understandability, actionability, or motivational power. Research in this area thus far has been limited to the evaluation by researchers themselves on a small number of synthetic questions. This article reports on what we believe to be the first in situ evaluation of a cybersecurity Question Answering (QA) assistant. We also evaluate a prompt engineered to help the cybersecurity QA assistant generate more effective answers. The study involved a 10-day deployment of a cybersecurity QA assistant in the form of a Chrome extension. Collectively, participants (N=51) evaluated answers generated by the assistant to over 1,000 cybersecurity questions they submitted as part of their regular day-to-day activities. The results suggest that a majority of participants found the assistant useful and often took actions based on the answers they received. In particular, the study indicates that prompting successfully improved the effectiveness of answers and, in particular, the likelihood that users follow their recommendations (fraction ofparticipants who actually followed the advice was 0.514 with prompting vs. 0.402 without prompting, p=4.61E-04), an impacton people’s actual behavior. We provide a detailed analysis of data collected in this study, discuss their implications, and outline next steps in the development and deployment of effective cybersecurity QA assistants that offer the promise of changing actual user behavior and of reducing human-related security incidents.more » « less
-
Generative AI, particularly Large Language Models (LLMs), has revolutionized human-computer interaction by enabling the generation of nuanced, human-like text. This presents new opportunities, especially in enhancing explainability for AI systems like recommender systems, a crucial factor for fostering user trust and engagement. LLM-powered AI-Chatbots can be leveraged to provide personalized explanations for recommendations. Although users often find these chatbot explanations helpful, they may not fully comprehend the content. Our research focuses on assessing how well users comprehend these explanations and identifying gaps in understanding. We also explore the key behavioral differences between users who effectively understand AI-generated explanations and those who do not. We designed a three-phase user study with 17 participants to explore these dynamics. The findings indicate that the clarity and usefulness of the explanations are contingent on the user asking relevant follow-up questions and having a motivation to learn. Comprehension also varies significantly based on users’ educational backgrounds.more » « less
-
Despite their ability to answer complex questions, it is unclear whether generative chatbots should be considered experts in any domain. There are several important cognitive and metacognitive differences that separate human experts from generative chatbots. First, human experts’ domain knowledge is deep, efficiently structured, adaptive, and intuitive – whereas generative chatbots’ knowledge is shallow and inflexible, leading to errors that human experts would rarely make. Second, generative chatbots lack access to critical metacognitive capacities that allow humans to detect errors in their own thinking and communicate this information to others. Though generative chatbots may surpass human experts in the future – for now, the nature of their knowledge structures and metacognition prevent them from reaching true expertise.more » « less
-
We report on the status of our Cybersecurity Assess- ment Tools (CATS) project that is creating and val- idating a concept inventory for cybersecurity, which assesses the quality of instruction of any first course in cybersecurity. In fall 2014, we carried out a Del- phi process that identified core concepts of cyber- security. In spring 2016, we interviewed twenty-six students to uncover their understandings and mis- conceptions about these concepts. In fall 2016, we generated our first assessment tool–a draft Cyberse- curity Concept Inventory (CCI), comprising approx- imately thirty multiple-choice questions. Each ques- tion targets a concept; incorrect answers are based on observed misconceptions from the interviews. This year we are validating the draft CCI using cognitive interviews, expert reviews, and psychometric testing. In this paper, we highlight our progress to date in developing the CCI. The CATS project provides infrastructure for a rig- orous evidence-based improvement of cybersecurity education. The CCI permits comparisons of different instructional methods by assessing how well students learned the core concepts of the field (especially ad- versarial thinking), where instructional methods re- fer to how material is taught (e.g., lab-based, case- studies, collaborative, competitions, gaming). Specif- ically, the CCI is a tool that will enable researchers to scientifically quantify and measure the effect of their approaches to, and interventions in, cybersecurity ed- ucation.more » « less
An official website of the United States government

