Students who take an online course, such as a MOOC, use the course's discussion forum to ask questions or reach out to instructors when encountering an issue. However, reading and responding to students' questions is difficult to scale because of the time needed to consider each message. As a result, critical issues may be left unresolved, and students may lose the motivation to continue in the course. To help address this problem, we build predictive models that automatically determine the urgency of each forum post, so that urgent posts can be brought to instructors' attention. This paper goes beyond previous work by predicting not just a binary decision cut-off but a post's level of urgency on a 7-point scale. First, we train and cross-validate several models on an original data set of 3,503 posts from MOOCs at the University of Pennsylvania. Second, to determine the generalizability of our models, we test their performance on a separate, previously published data set of 29,604 posts from MOOCs at Stanford University. While previous work on post urgency used only one data set, we evaluate the prediction across different data sets and courses. The best-performing model was a support vector regressor trained on the Universal Sentence Encoder embeddings of the posts, achieving an RMSE of 1.1 on the training set and 1.4 on the test set. Understanding the urgency of forum posts enables instructors to focus their time more effectively and, as a result, better support student learning.
                    
                            
                            Age Inference Using A Hierarchical Attention Neural Network
                        
                    
    
            While the inference of demographic attributes, such as age, gender, and location, has been extensively studied, most previous studies combine different sources of data, such as the user's biography, pictures, posts, and network, to obtain reasonable inference accuracy. However, it is not always practical to collect all those different forms of data. Therefore, in this paper, we consider methods for inferring age that use only Twitter posts (tweet text and emojis). We propose a hierarchical attention neural model that integrates independent linguistic knowledge gained from text and emojis when making a prediction. This hierarchical model is able to capture the intra-post relationship between these different post components, as well as the inter-post relationships of a user's posts. Our empirical evaluation using a data set generated from Wikidata demonstrates that our model achieves better performance than the state-of-the-art models, and still performs well when the number of posts per user is reduced in the training data set.
        
    
- Award ID(s):
- 1934925
- PAR ID:
- 10351970
- Date Published:
- Journal Name:
- Proceedings of the 30th ACM International Conference on Information & Knowledge Management
- Page Range / eLocation ID:
- 3273 to 3277
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            Hierarchical text classification, which aims to classify text documents into a given hierarchy, is an important task in many real-world applications. Recently, deep neural models have been gaining popularity for text classification due to their expressive power and minimal requirement for feature engineering. However, applying deep neural networks to hierarchical text classification remains challenging, because they rely heavily on a large amount of training data and cannot easily determine the appropriate levels of documents in the hierarchical setting. In this paper, we propose a weakly-supervised neural method for hierarchical text classification. Our method does not require a large amount of training data but requires only easy-to-provide weak supervision signals such as a few class-related documents or keywords. Our method effectively leverages such weak supervision signals to generate pseudo documents for model pre-training, and then performs self-training on real unlabeled data to iteratively refine the model. During the training process, our model features a hierarchical neural structure, which mimics the given hierarchy and is capable of determining the proper levels for documents with a blocking mechanism. Experiments on three datasets from different domains demonstrate the efficacy of our method compared with a comprehensive set of baselines.
- 
            Hashtags can greatly facilitate content navigation and improve user engagement in social media. Meaningful as it might be, recommending hashtags for photo sharing services such as Instagram and Pinterest remains a daunting task for two reasons. On the endogenous side, posts in photo sharing services often contain both images and text, which are likely to be correlated with each other. Therefore, it is crucial to coherently model both image and text as well as the interaction between them. On the exogenous side, hashtags are generated by users, and different users might come up with different tags for similar posts due to their different preferences and/or community effects. Therefore, it is highly desirable to characterize the users’ tagging habits. In this paper, we propose an integral and effective hashtag recommendation approach for photo sharing services. In particular, the proposed approach considers both the endogenous and exogenous effects by a content modeling module and a habit modeling module, respectively. For the content modeling module, we adopt the parallel co-attention mechanism to coherently model both image and text as well as the interaction between them; for the habit modeling module, we introduce an external memory unit to characterize the historical tagging habit of each user. The overall hashtag recommendations are generated on the basis of both the post features from the content modeling module and the habit influences from the habit modeling module. We evaluate the proposed approach on real Instagram data. The experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods in terms of recommendation accuracy, and that both content modeling and habit modeling contribute significantly to the overall recommendation accuracy.
- 
            Al-Nofaie, H (Ed.) Prior research has demonstrated relationships between personality traits of social media users and the language used in their posts. Few studies have examined whether there are relationships between personality traits of users and how they use emojis in their social media posts. Emojis are digital pictographs used to express ideas and emotions. There are thousands of emojis, which depict faces with expressions, objects, animals, and activities. We conducted a study with two samples (n = 76 and n = 245) in which we examined how emoji use on X (formerly Twitter) related to users’ personality traits and language use in posts. Personality traits were assessed from participants in an online survey. With participants’ consent, we analyzed word usage in posts. Word frequencies were calculated using the Linguistic Inquiry and Word Count (LIWC). In both samples, the results showed that those who used the most emojis had the lowest levels of openness to experience. Emoji use was unrelated to the other personality traits. In sample 1, emoji use was also related to use of words related to family, positive emotion, and sadness and less frequent use of articles and words related to insight. In sample 2, more frequent use of emojis in posts was related to more frequent use of "you" pronouns, "I" pronouns, and more frequent use of negative function words and words related to time. The results support the view that social media users’ characteristics may be gleaned from the content of their social media posts.
- 
            Emojis have quickly become a universal language that is used by users worldwide, for everyday tasks, across language barriers, and in different apps and platforms. The prevalence of emojis has quickly attracted great attention from various research communities such as natural language processing, Web mining, ubiquitous computing, and human-computer interaction, as well as other disciplines including social science, arts, psychology, and linguistics. This talk summarizes the recent efforts made by my research group and our collaborators on analyzing large-scale emoji data.