Social media has become a powerful and efficient platform for information diffusion. The increasing pervasiveness of social media use, however, has brought about the problem of fraudulent accounts that are intended to diffuse misinformation or malicious content. Twitter recently released comprehensive archives of fraudulent tweets that are possibly connected to a propaganda effort by the Internet Research Agency (IRA) targeting the 2016 U.S. presidential election. To understand information diffusion in fraudulent networks, we analyze structural properties of the IRA retweet network and develop deep neural network models to detect fraudulent tweets. The structural analysis reveals key characteristics of the fraudulent network. The experimental results demonstrate that the deep learning technique outperforms a traditional classification method in detecting fraudulent tweets. The findings have potential implications for curbing online misinformation.
Inferring #MeToo Experience Tweets using Classic and Neural Models

The #MeToo movement is one of several calls for social change to gain traction on Twitter in the past decade. The movement went viral after prominent individuals shared their experiences, and much of its power continues to be derived from experience sharing. Because millions of #MeToo tweets are published every year, it is important to accurately identify experience-related tweets. Therefore, we propose a new learning task and compare the effectiveness of classic machine learning models, ensemble models, and a neural network model that incorporates a pre-trained language model to reduce the impact of feature sparsity. We find that even with limited training data, the neural network model outperforms the classic and ensemble classifiers. Finally, we analyze the experience-related conversation in English during the first year of the #MeToo movement and determine that experience tweets represent a sizable minority of the conversation and are moderately correlated with major events.
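
As a rough illustration of what a classic bag-of-words baseline for this kind of task can look like, the sketch below trains a small multinomial Naive Bayes classifier in plain Python. The abstract does not name the classic models that were compared, so the choice of Naive Bayes, the whitespace tokenizer, the label names, and the four training sentences are all assumptions made for illustration only. Word-count features like these are sparse for short tweets, which is the problem the abstract says the pre-trained language model is meant to reduce.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; not drawn from the paper's dataset.
train_texts = [
    "this happened to me at work",
    "i was harassed by my boss",
    "read this article about the movement",
    "new report on the hashtag trend",
]
train_labels = ["experience", "experience", "other", "other"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("my boss harassed me"))
print(model.predict("new article about the trend"))
```

With so little training text, any word that never appeared in training contributes only the smoothing term, which is one concrete way feature sparsity limits a count-based model.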
- PAR ID: 10351518
- Date Published:
- Journal Name: Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA
- Page Range / eLocation ID: 107 to 117
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Determining the quality and reliability of information found on social media has been studied by several researchers. One set of solutions may not work in all cases. This paper presents a method to estimate the slant of tweets related to a topic. The general approach is to construct labeled data from tweets and use supervised learning to build predictive models. Results obtained from two datasets are compared against an OTC model and a CNN-based model.
- Effectively filtering and categorizing the large volume of user-generated content on social media during disaster events can help emergency management and disaster response teams prioritize their resources. Deep learning approaches, including recurrent neural networks and transformer-based models, have previously been used for this purpose. Capsule Neural Networks (CapsNets), initially proposed for image classification, have proven useful for text analysis as well. However, to the best of our knowledge, CapsNets have not been used for classifying crisis-related messages and have not been extensively compared with state-of-the-art transformer-based models such as BERT. Therefore, in this study, we performed a thorough comparison between CapsNet models, state-of-the-art BERT models, and two popular recurrent neural network models that have been successfully used for tweet classification, specifically LSTM and Bi-LSTM models, on the task of classifying crisis tweets in terms of both their informativeness (binary classification) and their humanitarian content (multi-class classification). For this purpose, we used several benchmark datasets for crisis tweet classification, namely CrisisBench, CrisisNLP, and CrisisLex. Experimental results show that the performance of the CapsNet models is on a par with that of the LSTM and Bi-LSTM models for all metrics considered, while the BERT models surpassed the other three models across different datasets and classes for both classification tasks. BERT could therefore be considered the best overall model for classifying crisis tweets.
- The Digital Library Research Laboratory (DLRL) has collected over 3.5 billion tweets on different events for the Coordinated, Behaviorally-Aware Recovery for Transportation and Power Disruptions (CBAR-tpd), the Integrated Digital Event Archiving and Library (IDEAL), and the Global Event Trend Archive Research (GETAR) projects. The tweet collection topics include heart attack, solar eclipse, terrorism, etc. There are several collections on naturally occurring events such as hurricanes, floods, and solar eclipses. Such naturally occurring events are distributed across space and time. It would be beneficial to researchers if we could perform a spatial-temporal analysis to test some hypotheses and to find any trends that tweets would reveal for such events. I apply an existing algorithm to detect locations from tweets, modifying it to work better with the type of datasets I work with. I use the time captured in tweets and also identify the tense of the sentences in tweets to perform the temporal analysis. I build a rule-based model for obtaining the tense of a tweet. The results from these two algorithms are merged to analyze naturally occurring moving events such as solar eclipses and hurricanes. Using the spatial-temporal information from tweets, I study whether tweets can be a relevant source of information in understanding the movement of the event. I create visualizations to compare the actual path of the event with the information extracted by my algorithms. After examining the results from the analysis, I noted that Twitter can be a reliable source for identifying places affected by moving events almost immediately. The locations obtained are at a more detailed level than in newswires. We can also identify, by date, the time at which an event affected a particular region.
- Although pain is widely recognized to be a multidimensional experience, it is typically measured by a unidimensional, patient self-reported visual analog scale (VAS). However, self-reported pain is subjective, difficult to interpret, and sometimes impossible to obtain. Machine learning models have been developed to automatically recognize pain at both the frame level and the sequence (or video) level. Many methods use or learn facial action units (AUs), defined by the Facial Action Coding System (FACS), for describing facial expressions in terms of muscle movement. In this paper, we analyze the relationship between sequence-level multidimensional pain measurements, frame-level AUs, and an AU-derived pain-related measure, the Prkachin and Solomon Pain Intensity (PSPI). We study methods that learn sequence-level metrics from frame-level metrics. Specifically, we explore an extended multitask learning model to predict VAS from human-labeled AUs with the help of other sequence-level pain measurements during training. This model consists of two parts: a multitask learning neural network model to predict multidimensional pain scores, and an ensemble learning model to linearly combine the multidimensional pain scores to best approximate VAS. Starting from human-labeled AUs, the model achieves a mean absolute error (MAE) on VAS of 1.73. It outperforms the provided human sequence-level estimates, which have an MAE of 1.76. Combining our machine learning model with the human estimates gives the best performance, an MAE on VAS of 1.48.
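
The last abstract above reports its results as mean absolute error (MAE) on VAS and describes a second stage that linearly combines several predicted pain scores. The sketch below shows only those two calculations in plain Python. The function names, weights, and numbers are hypothetical and chosen for illustration; the abstract does not say how the weights are fitted or which scores are combined, so this is not the authors' implementation.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and ground truth."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def linear_combination(score_rows, weights, bias=0.0):
    """Combine each row of per-dimension scores into a single estimate."""
    return [bias + sum(w * s for w, s in zip(weights, row)) for row in score_rows]


# Hypothetical example: two predicted pain scores per video sequence.
score_rows = [[1.0, 2.0], [3.0, 4.0]]
weights = [0.5, 0.5]          # assumed weights, not fitted values
vas_true = [2.0, 3.0]         # assumed ground-truth VAS ratings

vas_estimate = linear_combination(score_rows, weights)
print(vas_estimate)
print(mean_absolute_error(vas_estimate, vas_true))
```

A lower MAE means the combined estimate sits closer to the self-reported VAS on average, which is the sense in which the reported 1.48 is better than 1.73 or 1.76.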
 An official website of the United States government