

Title: #lets-discuss: Analyzing student affect in course forums using emoji.
Emoji are commonly used in social media to convey attitudes and emotions. While popular, their use in educational contexts has been sparsely studied. This paper reports on students' use of emoji in an online course forum in which students annotate and discuss course material in the margins of the online textbook. For this study, instructors created 11 custom emoji-hashtag pairs that enabled students to quickly communicate the affects and reactions they experienced while interacting with the course material. Examples include inviting discussion about a topic, declaring a topic interesting, or requesting assistance with a topic. We analyze emoji usage by over 1,800 students enrolled in multiple offerings of the same course across multiple academic terms. The data show that some emoji frequently appear together in posts associated with the same paragraphs, suggesting that students use emoji in this way to communicate complex affective states. We explore the use of computational models for predicting emoji at the post level, even when posts lack emoji. This capability can allow instructors to infer information about students' affective states during their "at home" interactions with course readings. Finally, we show that partitioning the emoji into distinct groups, rather than trying to predict individual emoji, is both of pedagogical value to instructors and improves the predictive performance of our approach using the BERT language model. Our procedure can be generalized to other courses for the benefit of other instructors.
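The group-level prediction idea can be sketched with a lightweight stand-in. The sketch below substitutes a TF-IDF bag-of-words classifier for the fine-tuned BERT model the paper describes, and the emoji-to-group mapping and training posts are illustrative, not the paper's actual partition or data.

```python
# Sketch: predict an emoji *group* for a forum post instead of one of the
# 11 individual emoji. A TF-IDF + logistic-regression pipeline stands in
# for the fine-tuned BERT classifier described in the paper; the group
# labels and training posts below are illustrative only.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical partition of the emoji-hashtag pairs into coarser groups.
GROUPS = ["curiosity", "confusion", "discussion"]

train_posts = [
    "this result is fascinating, I want to learn more",    # curiosity
    "what a surprising finding about enzymes",             # curiosity
    "I am lost, can someone explain this derivation?",     # confusion
    "this paragraph makes no sense to me, help please",    # confusion
    "let's discuss whether this model is realistic",       # discussion
    "I disagree, does anyone want to debate this claim?",  # discussion
]
train_labels = ["curiosity", "curiosity", "confusion",
                "confusion", "discussion", "discussion"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_posts, train_labels)

# Posts with no emoji can still be assigned a likely affect group.
prediction = model.predict(["can someone help explain this section?"])[0]
print(prediction)
```

Predicting one of a few groups, rather than one of 11 emoji, gives the classifier more examples per class, which is one plausible reason the grouped formulation performs better.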
Award ID(s):
1915724
NSF-PAR ID:
10374291
Editor(s):
Mitrovic, A.; Bosch, N.
Date Published:
Journal Name:
Proceedings of the 15th International Conference on Educational Data Mining
Page Range / eLocation ID:
339–345
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when they encounter an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post so that these posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at the University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. Whereas previous work on post urgency used only one data set, we evaluate prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.
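The regression setup can be sketched as follows. The sketch uses a TF-IDF vectorizer as a lightweight stand-in for the Universal Sentence Encoder embeddings named in the abstract, and the toy posts and urgency scores are invented for illustration.

```python
# Sketch: regress a forum post's urgency on a 1-7 scale. The abstract's
# best model used Universal Sentence Encoder embeddings; a TF-IDF
# vectorizer stands in here as a lightweight substitute, and the
# posts/scores below are illustrative only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVR

posts = [
    "just saying thanks for a great lecture",               # low urgency
    "minor typo in slide 12, not a big deal",               # low urgency
    "confused about the homework, could use a hint",        # medium
    "I cannot submit my assignment and it is due tonight",  # high urgency
    "the grading system lost my exam, please help now",     # high urgency
]
urgency = [1.0, 2.0, 4.0, 6.5, 7.0]

model = make_pipeline(TfidfVectorizer(), SVR(kernel="linear"))
model.fit(posts, urgency)

# Predict a continuous score, then clip it to the valid 1-7 range.
raw = model.predict(["help, I cannot open the exam page and it closes soon"])
score = float(np.clip(raw, 1.0, 7.0)[0])
print(round(score, 2))
```

A regressor, unlike a binary urgent/not-urgent classifier, lets instructors sort their queue by degree of urgency rather than triage in two bins.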
  2.
    Online forums are an integral part of modern-day courses, but motivating students to participate in educationally beneficial discussions can be challenging. Our proposed solution is to initialize (or “seed”) a new course forum with comments from past instances of the same course that are intended to trigger discussion that is beneficial to learning. In this work, we develop methods for selecting high-quality seeds and evaluate their impact in one instance of a 186-student biology course. We designed a scale for measuring the “seeding suitability” score of a given thread (an opening comment and its ensuing discussion). We then constructed a supervised machine learning (ML) model for predicting the seeding suitability score of a given thread. This model was evaluated in two ways: first, by comparing its performance to the expert opinion of the course instructors on test/holdout data; and second, by embedding it in a live course, where it was actively used to facilitate seeding by the course instructors. For each reading assignment in the course, we presented a ranked list of seeding recommendations to the course instructors, who could review the list and filter out seeds with inconsistent or malformed content. We then ran a randomized controlled study, in which one group of students was shown seeds recommended by the ML model, and another group was shown seeds recommended by an alternative model that ranked seeds purely by the length of discussion they generated in previous course instances. We found that students who received posts from either seeding model generated more discussion than a control group that did not receive seeded posts. Furthermore, students who received seeds selected by the ML-based model showed higher levels of engagement, as well as greater learning gains, than those who received seeds ranked by length of discussion.
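The recommendation step described above can be sketched as a simple rank-and-review loop. The function and field names below are hypothetical, and in the actual system the suitability scores would come from the supervised model rather than being hand-assigned.

```python
# Sketch: rank candidate seed threads for a new reading assignment by a
# model-predicted "seeding suitability" score, then hand the top-k list
# to instructors for review. All names and scores here are illustrative;
# real scores would come from the trained suitability model.

def recommend_seeds(threads, k=3):
    """Return the k highest-scoring threads, best first, for instructor review."""
    ranked = sorted(threads, key=lambda t: t["suitability"], reverse=True)
    return ranked[:k]

# Illustrative candidate threads from past course instances.
candidates = [
    {"id": "t1", "opening": "Why does ATP matter here?", "suitability": 0.91},
    {"id": "t2", "opening": "When is the midterm?",      "suitability": 0.12},
    {"id": "t3", "opening": "Is this model realistic?",  "suitability": 0.78},
    {"id": "t4", "opening": "+1",                        "suitability": 0.05},
]

for thread in recommend_seeds(candidates, k=2):
    print(thread["id"], thread["suitability"])
# → t1 0.91
# → t3 0.78
```

Keeping a human in the loop, as the study did, means the ranking only has to surface plausible seeds; instructors filter out malformed ones before students see them.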
  3. Lynch, Collin F.; Merceron, Agathe; Desmarais, Michel; Nkambou, Roger (Eds.)
    Discussion forums are the primary channel for social interaction and knowledge sharing in Massive Open Online Courses (MOOCs). Many researchers have analyzed social connections on MOOC discussion forums. However, to the best of our knowledge, there is little research that distinguishes between the types of connections students make based upon the content of their forum posts. We analyze this effect by distinguishing on- and off-topic posts and comparing their respective social networks. We then analyze how these types of posts and their social connections can be used to predict students' final course performance. Pursuant to this work, we developed a binary classifier to identify on- and off-topic posts and applied our analysis with both the hand-coded and the predicted labels. We conclude that post type does affect how students' connections, both to their closest neighbors and to members of their clustered communities, relate to their learning outcomes.
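The per-type network comparison can be sketched with plain Python. The post data and reply structure below are invented for illustration; in the study, the on/off-topic labels come from hand coding or the binary text classifier.

```python
# Sketch: build separate co-participation networks for on-topic and
# off-topic posts, then compare a student's neighbors in each. The posts
# and reply links are illustrative; in the paper the on/off-topic labels
# come from hand coding or a binary classifier.
from collections import defaultdict

posts = [
    # (author, replies_to, on_topic)
    ("alice", None,    True),
    ("bob",   "alice", True),
    ("carol", "alice", True),
    ("dave",  None,    False),
    ("bob",   "dave",  False),
]

# One undirected adjacency structure per post type.
networks = {True: defaultdict(set), False: defaultdict(set)}
for author, parent, on_topic in posts:
    if parent is not None:
        networks[on_topic][author].add(parent)
        networks[on_topic][parent].add(author)

# The same student can have different neighbors in the two networks.
print(sorted(networks[True]["bob"]))   # bob's on-topic neighbors
print(sorted(networks[False]["bob"]))  # bob's off-topic neighbors
```

Splitting the edges by post type before computing neighborhoods or communities is what lets the two networks be compared as predictors of course performance.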
  4. This Innovate Practice full paper presents a cloud-based personalized learning lab platform. Personalized learning is gaining popularity in online computer science education because it paces the learning progress and adapts the instructional approach to each individual learner from a diverse background. Among the various instructional methods in computer science education, hands-on labs have unique requirements for understanding learner behavior and assessing learner performance for personalization. However, these are rarely addressed in existing research. In this paper, we propose a personalized learning platform called ThoTh Lab, specifically designed for computer science hands-on labs in a cloud environment. ThoTh Lab can identify a student's learning style from their activities and adapt learning material accordingly. With awareness of student learning styles, instructors are able to use techniques more suitable for each specific student and, hence, improve the speed and quality of the learning process. With that in mind, ThoTh Lab also provides student performance prediction, which allows instructors to adjust the learning progress and take other measures to help students in a timely manner. For example, instructors may provide more detailed instructions to help slow starters, while assigning more challenging labs to quick learners in the same class. To evaluate ThoTh Lab, we conducted an experiment and collected data from an upper-division cybersecurity class for undergraduate students at Arizona State University in the US. The results show that ThoTh Lab can identify learning style with reasonable accuracy. By leveraging the personalized lab platform for a senior-level cybersecurity course, our lab-use study also shows that the presented solution improves student engagement, with better understanding of lab assignments and more effort spent on hands-on projects, thus greatly enhancing learning outcomes.
  5.
    The demands of engineering writing are much different from those of general writing, which students study from grade school through first-year composition. First, the content of engineering writing is both more specific and more complex [1]. As a second difference, not only do the types of audiences vary more in engineering, but so does the audience’s level of knowledge about the content. Yet a third difference is that the expected level of precision in engineering writing is much higher [2]. Still a fourth difference is that the formats for engineering reports, which call for writing in sections and for incorporating illustrations and equations, are much more detailed than the double-spaced essays of first-year composition. Because many engineering students do not take a technical writing course until their junior or senior year, a gap exists between what undergraduates have learned to do in general writing courses and what those students are expected to produce in design courses and laboratory courses. While some engineering colleges such as the University of Michigan have bridged the gap with instruction about engineering writing in first-year design, a few such as the University of Wisconsin-Madison have done so with first-year English [4]. Still, a third group of schools such as Purdue has done so by integrating these courses [5]. Unfortunately, many other engineering colleges have not bridged the gap in the first year. For instance, at Penn State, first-year design is not an option for teaching engineering writing because this course spans only one semester and has no room for another major instructional topic. In addition, at this same institution, first-year composition is not an option because the English Department is adamant about having that course’s scope remain on general writing.
Although a technical writing course in the junior or senior year should theoretically bridge the gap, not understanding the differences between general writing and engineering writing poses problems for engineering students who have not yet taken technical writing. For instance, not understanding the organization of an engineering report can significantly pull down a report’s grade and lead students to assume that they are inherently weak at engineering writing [6]. Another problem is that engineering students who have not bridged the gap between general writing and engineering writing are at a disadvantage when writing emails and reports during a summer internship. To bridge this gap, we have created an online resource [7] that teaches students the essential differences between general writing and the writing done by engineers. At the heart of the resource are two web pages—one on writing reports and the other on writing professional emails. Each page consists of a series of short films that present the essential differences between the two types of writing and a quiz to ensure comprehension of the films. In addition, students have links to model documents, while faculty have links to lesson plans. Using an NSF I-Corps approach [8], which is an educational version of how to build a start-up company [9], we have developed our web resource over the past six months. Specifically, we have tested value propositions through customer interviews of faculty and students in first-year courses in which the resource has been piloted. Using the results of those customer interviews, we have revised our two web pages. This paper presents the following highlights of this effort: (1) our customer discoveries about the gap between general writing and engineering writing, (2) the corresponding pivots that we made in the online resource to respond to those discoveries, and (3) the website usage statistics that show the effects of making those pivots.