Title: How Developers and Tools Categorize Sentiment in Stack Overflow Questions - A Pilot Study
The paper presents results from a pilot questionnaire-based study on ten Stack Overflow (SO) questions. Eleven developers were tasked with determining whether the sentiment of each SO question was positive, negative, or neutral. The questionnaire results indicate that developers mostly rated the sentiment of SO questions as neutral, stating that they received little or no emotional feedback from the questions. Tools designed to analyze software-engineering-related texts (SentiStrength-SE, SentiCR, and Senti4SD) were, on average, more closely aligned with developer ratings for a majority of the questions than general-purpose sentiment tools. We discuss cases where tool and developer sentiment differ, along with implications of the results. Overall, sentiment tool output computed on the question title and body together is more closely aligned with developer ratings than output computed on the title alone. Since SO is a very common medium of technical exchange, we also report that code snippets, short titles, and multiple tags were the top three features developers prefer in an SO question in order for it to be answered quickly.
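To make the title-alone versus title-plus-body comparison concrete, here is a minimal sketch of lexicon-based three-way sentiment classification. It is not one of the tools evaluated in the paper (SentiStrength-SE, SentiCR, Senti4SD); the word lists, example question, and scoring rule are invented for illustration.

```python
# Toy three-way sentiment classifier (illustrative lexicon, not a real tool).
POSITIVE = {"great", "love", "works", "thanks", "solved"}
NEGATIVE = {"error", "fail", "broken", "crash", "frustrating"}

def classify(text: str) -> str:
    """Label text positive/negative/neutral by counting lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

title = "How do I parse JSON in Java?"
body = "My code throws an error and I cannot find why. Any help appreciated."

print(classify(title))                 # neutral: title alone carries no cue
print(classify(title + " " + body))    # negative: the body mentions "error"
```

The example mirrors the paper's observation: the title alone often reads as neutral, while the body can shift the overall label.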
Award ID(s):
1855756
NSF-PAR ID:
10267563
Author(s) / Creator(s):
Date Published:
Journal Name:
2021 IEEE/ACM Sixth International Workshop on Emotion Awareness in Software Engineering (SEmotion)
Page Range / eLocation ID:
19 to 22
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The paper presents an eye tracking pilot study on how developers read and assess sentiment in twenty-four GitHub pull requests containing emoji, randomly selected from five different open source applications. Gaze data was collected on various elements of the pull request page in Google Chrome while the developers were tasked with determining perceived sentiment. The developers' perceived sentiment was compared with the sentiment output of five state-of-the-art sentiment analysis tools. SentiStrength-SE had the highest performance, with 55.56% of its predictions agreed upon by study participants. On the other hand, Stanford CoreNLP fared the worst, with only 5.56% of its predictions matching those of the participants. Gaze data shows that the top three areas developers looked at most were the comment body, added lines of code, and the username of the comment author. The results also show high attention to emoji in the pull request comment body compared to the rest of the comment text. These results can help provide additional guidelines for the pull request review process.
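The tool-versus-participant agreement percentages quoted above (55.56% for SentiStrength-SE, 5.56% for Stanford CoreNLP) reduce to a simple match rate between two label sequences. A minimal sketch, with invented labels rather than the study's data:

```python
# Agreement between a tool's predictions and participant labels, as a percentage.
def agreement(tool_labels, participant_labels):
    """Percentage of positions where the tool label matches the participant label."""
    matches = sum(t == p for t, p in zip(tool_labels, participant_labels))
    return 100.0 * matches / len(tool_labels)

# Hypothetical labels for four pull requests.
participants = ["positive", "neutral", "negative", "neutral"]
tool = ["positive", "neutral", "positive", "negative"]

print(f"{agreement(tool, participants):.2f}%")  # 50.00%
```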
  2. In March 2020, the global COVID-19 pandemic forced universities across the United States to immediately stop face-to-face activities and transition to virtual instruction. While this transition was not easy for anyone, the shift to online learning was especially difficult for STEM courses, particularly engineering, which has a strong practical/laboratory component. Additionally, underrepresented students (URMs) in engineering experienced a range of difficulties during this transition. The purpose of this paper is to highlight underrepresented engineering students’ experiences as a result of COVID-19. In particular, we aim to highlight stories shared by participants who indicated a desire to share their experience with their instructor. In order to better understand these experiences, research participants were asked to share a story, using the novel data collection platform SenseMaker, based on the following prompt: Imagine you are chatting with a friend or family member about the evolving COVID-19 crisis. Tell them about something you have experienced recently as an engineering student. Conducting a SenseMaker study involves four iterative steps: 1) Initiation is the process of designing signifiers, testing, and deploying the instrument; 2) Story Collection is the process of collecting data through narratives; 3) Sense-making is the process of exploring and analyzing patterns of the collection of narratives; and 4) Response is the process of amplifying positive stories and dampening negative stories to nudge the system to an adjacent possible (Van der Merwe et al. 2019). Unlike traditional surveys or other qualitative data collection methods, SenseMaker encourages participants to think more critically about the stories they share by inviting them to make sense of their story using a series of triads and dyads. 
After completing their narrative, participants were asked a series of triadic, dyadic, and sentiment-based multiple-choice questions (MCQs) relevant to their story. One MCQ in particular asked, "If you could do so without fear of judgment or retaliation, who would you share this story with?" with the following options: 1) Family 2) Instructor 3) Peers 4) Prefer not to answer 5) Other. A third of the participants indicated that they would share their story with their instructor, so we explored this question further. Additionally, this paper aims to highlight the subset of students whose primary motivation for their actions was Necessity. High-level qualitative findings from the data show that students valued Grit and Perseverance, that recent experiences influenced their Sense of Purpose, and that their decisions were made largely based on Intuition. Chi-squared tests showed no significant differences across race in the desire to share with their instructor; however, there were significant differences when factoring in gender, suggesting that gender has a large impact on the complexity of navigating school during this time. Lastly, ~50% of participants reported feeling negative or extremely negative about their experiences, ~30% reported feeling neutral, and ~20% reported feeling positive or extremely positive. In the study, a total of 500 micro-narratives from underrepresented engineering students were collected from June to July 2020. Undergraduate and graduate students were recruited through the researchers' personal networks, social media, and organizations such as NSBE. Participants had the option to indicate who would be able to read their stories: 1) Everyone, 2) Researchers Only, or 3) No One. This work presents qualitative stories from those who granted permission for everyone to read.
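The chi-squared independence test mentioned above, applied to gender versus willingness to share with an instructor, can be sketched with the standard contingency-table statistic. The counts below are hypothetical, not the study's data:

```python
# Pearson chi-squared statistic for a contingency table (no SciPy needed).
def chi_squared(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = gender groups, cols = would share / would not.
table = [[40, 60],
         [70, 30]]
print(round(chi_squared(table), 2))  # 18.18 — compare against a chi^2 critical value
```

The resulting statistic is compared against a chi-squared critical value (1 degree of freedom for a 2x2 table) to decide significance.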
  3.
    Stack Overflow is commonly used by software developers to help solve problems they face while working on software tasks such as fixing bugs or building new features. Recent research has explored how the content of Stack Overflow posts affects attraction and how the reputation of users attracts more visitors. However, there is very little evidence on the effect that visual attractors and content quantity have on directing gaze toward parts of a post, and on which parts hold a user's attention longer. Moreover, little is known about how these attractors help developers (students and professionals) answer comprehension questions. This paper presents an eye tracking study of thirty developers constrained to reading only Stack Overflow posts while summarizing four open source methods or classes. Results indicate that, on average, paragraphs and code snippets were fixated upon most often and longest. When ranking pages by the number of code blocks and paragraphs they contained, we found that while the presence of more code blocks did not affect the number of fixations, increasing numbers of plain-text paragraphs significantly drove down fixations on comments. SO posts viewed only by students had longer fixation times on code elements within the first ten fixations. We also found that 16 developer summaries contained 5 or more meaningful terms from the SO posts viewed. We discuss how our observations of reading behavior could inform how users structure their posts.
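Ranking post elements by how often and how long they were fixated, as described above, is a straightforward aggregation over fixation records. A sketch with invented fixation data (the element names and durations are assumptions, not the study's measurements):

```python
# Aggregate fixations per post element, then rank by count and total duration.
from collections import defaultdict

# Hypothetical fixation log: (element type, fixation duration in ms).
fixations = [
    ("paragraph", 220), ("code_snippet", 310), ("paragraph", 180),
    ("comment", 90), ("code_snippet", 250), ("paragraph", 140),
]

counts = defaultdict(int)      # fixations per element type
durations = defaultdict(int)   # total fixation time per element type
for element, ms in fixations:
    counts[element] += 1
    durations[element] += ms

# Rank elements by fixation count, breaking ties by total duration.
ranked = sorted(counts, key=lambda e: (counts[e], durations[e]), reverse=True)
print(ranked)  # ['paragraph', 'code_snippet', 'comment']
```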
  4. Abstract Purpose Social media users share their ideas, thoughts, and emotions with other users. However, it is not clear how online users respond to new research outcomes. This study aims to predict the nature of the emotions expressed by Twitter users toward scientific publications. Additionally, we investigate which features of the research articles help in such prediction. Identifying the sentiments expressed toward research articles on social media will help scientists gauge the societal impact of their work. Design/methodology/approach Many tools exist for sentiment analysis, so we applied five of them to check which were suitable for capturing a tweet's sentiment value and decided to use NLTK VADER and TextBlob. We segregated the sentiment values into negative, positive, and neutral classes. We measured the mean and median of tweets' sentiment values for research articles with more than one tweet. We then built machine learning models to predict the sentiments of tweets related to scientific publications and investigated the essential features that controlled the prediction models. Findings We found that the most important feature in all the models was the sentiment of the research article title, followed by the author count. We observed that tree-based models performed better than other classification models, with Random Forest achieving 89% accuracy for binary classification and 73% accuracy for three-label classification. Research limitations In this research, we used state-of-the-art sentiment analysis libraries; however, these libraries may vary in their sentiment prediction behavior. Tweet sentiment may be influenced by a multitude of circumstances and is not always immediately tied to the paper's details. In the future, we intend to broaden the scope of our research by employing word2vec models.
Practical implications Many studies have focused on understanding the impact of science on scientists or on how science communicators can improve their outcomes. Research in this area has relied on a few limited measures, such as citations and user studies with small datasets. There is currently a critical need for novel methods to quantify and evaluate the broader impact of research. This study will help scientists better comprehend the emotional impact of their work. Additionally, understanding the public's interest and reactions helps science communicators identify effective ways to engage with the public and build positive connections between scientific communities and the public. Originality/value This study extends work on public engagement with science, the sociology of science, and computational social science. It will enable researchers to identify areas in which there is a gap between public and expert understanding and provide strategies by which this gap can be bridged.
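The segregation of sentiment values into negative/positive/neutral and the per-article mean/median aggregation described above can be sketched as follows. The ±0.05 threshold is VADER's conventional cutoff for its compound score; the scores and article names here are invented, not taken from the study:

```python
# Per-article aggregation of tweet sentiment scores in [-1, 1].
from statistics import mean, median

def label(score, threshold=0.05):
    """Map a compound-style score to a three-way label (0.05 cutoff assumed)."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

# Hypothetical tweet sentiment scores per research article.
tweets_per_article = {
    "paper_A": [0.6, 0.1, -0.2],
    "paper_B": [0.0, -0.4],
}

for article, scores in tweets_per_article.items():
    m, med = mean(scores), median(scores)
    print(article, round(m, 3), med, label(med))
```

The mean and median summarize each article's tweet set, and the label of the summary score feeds the downstream classifiers.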
  5.
    Virtual conversational assistants designed specifically for software engineers could have a huge impact on the time it takes for software engineers to get help. Research efforts are focusing on virtual assistants that support specific software development tasks such as bug repair and pair programming. In this paper, we study the use of online chat platforms as a resource for collecting developer opinions that could potentially help in building opinion Q&A systems, as a specialized instance of virtual assistants and chatbots for software engineers. Opinion Q&A has a stronger presence in chats than in other developer communications, so mining them can provide a valuable resource for developers seeking quick insight about a specific development topic (e.g., What is the best Java library for parsing JSON?). We address the problem of opinion Q&A extraction by developing automatic identification of opinion-asking questions and extraction of participants' answers from public online developer chats. We evaluate our automatic approaches on chats spanning six programming communities and two platforms. Our results show that a heuristic approach to identifying opinion-asking questions works well (0.87 precision), and that a deep learning approach customized to the software domain outperforms heuristic-based, machine-learning-based, and community-question-answering deep learning approaches for answer extraction.
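A toy version of heuristic opinion-question identification, in the spirit of the approach named above, can be written with a few regular-expression patterns. These patterns are invented for illustration and are not the paper's actual heuristics:

```python
# Flag chat messages that ask for opinions (hypothetical patterns).
import re

OPINION_PATTERNS = [
    re.compile(r"\bwhat(?:'s| is) the best\b", re.IGNORECASE),
    re.compile(r"\bwhich .* (?:do you )?(?:prefer|recommend)\b", re.IGNORECASE),
]

def is_opinion_question(message: str) -> bool:
    """True if the message ends with '?' and matches an opinion-asking pattern."""
    return message.rstrip().endswith("?") and any(
        p.search(message) for p in OPINION_PATTERNS
    )

print(is_opinion_question("What is the best Java library for parsing JSON?"))  # True
print(is_opinion_question("How do I import a module?"))                        # False
```

A real system would layer many more patterns and, per the paper, hand the harder answer-extraction step to a learned model.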