Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- Accurate estimates of user location are important for many online services, including event detection, disaster management, and determining public opinion. Neural network-based techniques have proven to be highly effective in predicting user location. However, these models typically require a large amount of labeled training data, which can be difficult to obtain in real-world scenarios. In this article, we present two approaches to tackle the issue of limited training data when predicting city-level location. First, we consider a self-supervised approach that trains a state-level model without labeled data and then integrates this knowledge into the training dataset used for city-level predictions. Second, we explore the option of increasing the number of training examples by utilizing external resources to generate synthetic users. Finally, we combine these two strategies, exploiting the benefits of both. We empirically evaluate our proposed techniques on multiple Twitter/X datasets and show that our models perform significantly better than the state-of-the-art, with improvements of up to 6% for Acc@161 and 8% for F1 score.
- Intermedia agenda setting (IAS) theory suggests that different news sources can influence each other's agenda. While this theory is well established in the existing literature, whether it still holds in today's high-choice media environment, which includes news producers of differing credibility and ideological dispositions, is an open question. Through two case studies, the 2016 and 2020 U.S. presidential elections, we show that media are still largely aligned, especially in the broad topics they choose to cover, and that the level of alignment along the credibility dimension is comparable to that along the ideology dimension. Furthermore, we find that the coverage of the Republican candidate is better aligned across different media types than that of the Democratic candidate, and that media divergence has increased along both dimensions from 2016 to 2020. Finally, we demonstrate that high-credibility media still play a dominant role in the IAS process, though we caution that their IAS power for the Democratic candidate declined over the course of four years.
- While demographic attributes, such as age, gender, and location, have been extensively studied, most previous studies combine different sources of data, such as the user's biography, pictures, posts, and network, to obtain reasonable inference accuracies. However, it is not always practical to collect all those different forms of data. Therefore, in this paper, we consider methods for inferring age that only use Twitter posts (tweet text and emojis). We propose a hierarchical attention neural model that integrates independent linguistic knowledge gained from text and emojis when making a prediction. This hierarchical model is able to capture the intra-post relationships between these different post components, as well as the inter-post relationships of a user's posts. Our empirical evaluation using a data set generated from Wikidata demonstrates that our model achieves better performance than the state-of-the-art models, and still performs well when the number of posts per user is reduced in the training data set.
 An official website of the United States government
