NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Do LLMs consider security? an empirical study on responses to programming questions

https://doi.org/10.1007/s10664-025-10658-6

Sajadi, Amirali; Le, Binh; Nguyen, Anh; Damevski, Kostadin; Chatterjee, Preetha (July 2025, Empirical Software Engineering)

Abstract The widespread adoption of conversational LLMs for software development has raised new security concerns regarding the safety of LLM-generated content. Our motivational study outlines ChatGPT’s potential in volunteering context-specific information to the developers, promoting safe coding practices. Motivated by this finding, we conduct a study to evaluate the degree of security awareness exhibited by three prominent LLMs: Claude 3, GPT-4, and Llama 3. We prompt these LLMs with Stack Overflow questions that contain vulnerable code to evaluate whether they merely provide answers to the questions or if they also warn users about the insecure code, thereby demonstrating a degree of security awareness. Further, we assess whether LLM responses provide information about the causes, exploits, and the potential fixes of the vulnerability, to help raise users’ awareness. Our findings show that all three models struggle to accurately detect and warn users about vulnerabilities, achieving a detection rate of only 12.6% to 40% across our datasets. We also observe that the LLMs tend to identify certain types of vulnerabilities related to sensitive information exposure and improper input neutralization much more frequently than other types, such as those involving external control of file names or paths. Furthermore, when LLMs do issue security warnings, they often provide more information on the causes, exploits, and fixes of vulnerabilities compared to Stack Overflow responses. Finally, we provide an in-depth discussion on the implications of our findings, and demonstrated a CLI-based prompting tool that can be used to produce more secure LLM responses.
more » « less
Free, publicly-accessible full text available July 1, 2026
Uncovering the Causes of Emotions in Software Developer Communication Using Zero-shot LLMs

https://doi.org/10.1145/3597503.3639223

Imran, Mia Mohammad; Chatterjee, Preetha; Damevski, Kostadin (April 2024, ACM)

Full Text Available
Shedding Light on Software Engineering-specific Metaphors and Idioms

https://doi.org/10.1145/3597503.3639585

Imran, Mia Mohammad; Chatterjee, Preetha; Damevski, Kostadin (April 2024, ACM)

Full Text Available
Incivility in Open Source Projects: A Comprehensive Annotated Dataset of Locked GitHub Issue Threads

https://doi.org/10.1145/3643991.3644887

Ehsani, Ramtin; Imran, Mia Mohammad; Zita, Robert; Damevski, Kostadin; Chatterjee, Preetha (April 2024, ACM)

Full Text Available
Automatic Extraction of Opinion-Based Q&A from Online Developer Chats

https://doi.org/10.1109/ICSE43902.2021.00115

Chatterjee, Preetha; Damevski, Kostadin; Pollock, Lori (May 2021, 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE))
null (Ed.)
Full Text Available
Automatically Identifying the Quality of Developer Chats for Post Hoc Use

https://doi.org/10.1145/3450503

Chatterjee, Preetha; Damevski, Kostadin; Kraft, Nicholas A.; Pollock, Lori (July 2021, ACM Transactions on Software Engineering and Methodology)
null (Ed.)
Software engineers are crowdsourcing answers to their everyday challenges on Q&A forums (e.g., Stack Overflow) and more recently in public chat communities such as Slack, IRC, and Gitter. Many software-related chat conversations contain valuable expert knowledge that is useful for both mining to improve programming support tools and for readers who did not participate in the original chat conversations. However, most chat platforms and communities do not contain built-in quality indicators (e.g., accepted answers, vote counts). Therefore, it is difficult to identify conversations that contain useful information for mining or reading, i.e., conversations of post hoc quality. In this article, we investigate automatically detecting developer conversations of post hoc quality from public chat channels. We first describe an analysis of 400 developer conversations that indicate potential characteristics of post hoc quality, followed by a machine learning-based approach for automatically identifying conversations of post hoc quality. Our evaluation of 2,000 annotated Slack conversations in four programming communities (python, clojure, elm, and racket) indicates that our approach can achieve precision of 0.82, recall of 0.90, F-measure of 0.86, and MCC of 0.57. To our knowledge, this is the first automated technique for detecting developer conversations of post hoc quality.
more » « less
Full Text Available
Automatic Extraction of Opinion-based Q&A from Online Developer Chats

Chatterjee, Preetha; Damesvski, Kostadin; Pollock, Lori (January 2021, Proceedings of the International Conference on Software Engineering)
null (Ed.)
Virtual conversational assistants designed specifically for software engineers could have a huge impact on the time it takes for software engineers to get help. Research efforts are focusing on virtual assistants that support specific software development tasks such as bug repair and pair programming. In this paper, we study the use of online chat platforms as a resource towards collecting developer opinions that could potentially help in building opinion Q&A systems, as a specialized instance of virtual assistants and chatbots for software engineers. Opinion Q&A has a stronger presence in chats than in other developer communications, thus mining them can provide a valuable resource for developers in quickly getting insight about a specific development topic (e.g., What is the best Java library for parsing JSON?). We address the problem of opinion Q&A extraction by developing automatic identification of opinion-asking questions and extraction of participants’ answers from public online developer chats. We evaluate our automatic approaches on chats spanning six programming communities and two platforms. Our results show that a heuristic approach to opinion-asking questions works well (.87 precision), and a deep learning approach customized to the software domain outperforms heuristics-based, machine-learning-based and deep learning for answer extraction in community question answering.
more » « less
Full Text Available
Extracting Archival-Quality Information from Software-Related Chats

https://doi.org/10.1145/3377812.3381391

Chatterjee, Preetha (January 2020, ICSE Doctoral Symposium)

Software developers are increasingly having conversations about software development via online chat services. Many of those chat communications contain valuable information, such as code descriptions, good programming practices, and causes of common errors/exceptions. However, the nature of chat community content is transient, as opposed to the archival nature of other developer communications such as email, bug reports and Q&A forums. As a result, important information and advice are lost over time. The focus of this dissertation is Extracting Archival Information from Software-Related Chats, specifically to (1) automatically identify conversations which contain archival-quality information, (2) accurately reduce the granularity of the information reported as archival information, and (3) conduct a case study to investigate how archival quality information extracted from chats compare to related posts in Q&A forums. Archiving knowledge from developer chats that could be used potentially in several applications such as: creating a new archival mechanism available to a given chat community, augmenting Q&A forums, or facilitating the mining of specific information and improving software maintenance tools.
more » « less
Full Text Available
Software-related Slack Chats with Disentangled Conversations

Chatterjee, Preetha; Damevski, Kostadin; Kraft, Nicholas A.; Pollock, Lori (January 2020, IEEE International Working Conference on Mining Software Repositories)

More than ever, developers are participating in public chat communities to ask and answer software development questions. With over ten million daily active users, Slack is one of the most popular chat platforms, hosting many active channels focused on software development technologies, e.g., python, react. Prior studies have shown that public Slack chat transcripts contain valuable information, which could provide support for improving automatic software maintenance tools or help researchers understand developer struggles or concerns. In this paper, we present a dataset of software-related chat conversations, curated for two years from three open Slack communities (python, clojure, elm). Our dataset consists of 38,955 conversations, 437,893 utterances, contributed by 12,171 users. We also share the code for a customized machine-learning based algorithm that automatically extracts (or disentangles) conversations from the downloaded chat transcripts.
more » « less
Full Text Available
Software-related Slack Chats with Disentangled Conversations

https://doi.org/10.1145/3379597.3387493

Chatterjee, Preetha; Damevski, Kostadin; Kraft, Nicholas A.; Pollock, Lori (January 2020, Mining Software Repositories (MSR))

More than ever, developers are participating in public chat communities to ask and answer software development questions. With over ten million daily active users, Slack is one of the most popular chat platforms, hosting many active channels focused on software development technologies, e.g., python, react. Prior studies have shown that public Slack chat transcripts contain valuable information, which could provide support for improving automatic software maintenance tools or help researchers understand developer struggles or concerns. In this paper, we present a dataset of software-related Q&A chat conversations, curated for two years from three open Slack communities (python, clojure, elm). Our dataset consists of 38,955 conversations, 437,893 utterances, contributed by 12,171 users. We also share the code for a customized machine-learning based algorithm that automatically extracts (or disentangles) conversations from the downloaded chat transcripts.
more » « less
Full Text Available

Search for: All records