NSF PAR Search | NSF Public Access Repository

Utilizing External Knowledge to Enhance Location Prediction for Twitter/X Users in Low Resource Settings

https://doi.org/10.1145/3673899

Liu, Yaguang; Singh, Lisa (July 2024, ACM Transactions on Spatial Algorithms and Systems)

Accurate estimates of user location are important for many online services, including event detection, disaster management, and determining public opinion. Neural network-based techniques have proven to be highly effective in predicting user location. However, these models typically require a large amount of labeled training data, which can be difficult to obtain in real-world scenarios. In this article, we present two approaches to tackle the issue of limited training data when predicting city-level location. First, we consider a self-supervised approach that trains a state-level model without labeled data and then integrates this knowledge into the training dataset used for city-level predictions. Second, we explore the option of increasing the number of training examples by utilizing external resources to generate synthetic users. Finally, we combine these two strategies, exploiting the benefits of both. We empirically evaluate our proposed techniques on multiple Twitter/X datasets and show that our models perform significantly better than the state-of-the-art, with improvements of up to 6% for Acc@161 and 8% for F1 score.
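The Acc@161 figure reported above is commonly read in the geolocation literature as the share of users whose predicted location falls within 161 km (about 100 miles) of their true location. The following is a minimal sketch of that reading, not the authors' evaluation code; the function names `haversine_km` and `acc_at_161`, the (latitude, longitude) tuple layout, and the mean Earth radius of 6371 km are all assumptions made for illustration.

```python
import math

# Assumed mean Earth radius; the paper's evaluation may use a different value.
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def acc_at_161(predicted, actual, threshold_km=161.0):
    """Fraction of predictions within `threshold_km` of the true location.

    `predicted` and `actual` are equal-length lists of (lat, lon) tuples
    (a hypothetical layout chosen for this sketch).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1
        for p, a in zip(predicted, actual)
        if haversine_km(*p, *a) <= threshold_km
    )
    return hits / len(predicted)
```

For example, a prediction of Washington, DC for a user actually in Baltimore counts as a hit under this definition, while a prediction of New York for a user in Los Angeles does not.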
Full Text Available
Mitigating demographic bias of machine learning models on social media

https://doi.org/10.1145/3617694.3623244

Wang, Yanchen; Singh, Lisa (October 2023, ACM)

Full Text Available
Using topic-noise models to generate domain-specific topics across data sources

https://doi.org/10.1007/s10115-022-01805-2

Churchill, Rob; Singh, Lisa (May 2023, Knowledge and Information Systems)

Full Text Available
DeMis: Data-Efficient Misinformation Detection Using Reinforcement Learning

https://doi.org/10.1007/978-3-031-26390-3_14

Kawintiranon, Kornraphop; Singh, Lisa (March 2023, Joint European Conference on Machine Learning and Knowledge Discovery in Databases)
Amini, MR.; Canu, S.; Fischer, A.; Guns, T.; Kralj Novak, P.; Tsoumakas, G. (Eds.)
Full Text Available
Combining vs. Transferring Knowledge: Investigating Strategies for Improving Demographic Inference in Low Resource Settings

https://doi.org/10.1145/3539597.3570462

Liu, Yaguang; Singh, Lisa (February 2023, WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining)

Full Text Available
Identifying High-Quality Training Data for Misinformation Detection

https://doi.org/10.5220/0012089000003541

Haber, Jaren; Kawintiranon, Kornraphop; Singh, Lisa; Chen, Alexander; Pizzo, Aidan; Pogrebivsky, Anna; Yang, Joyce (January 2023, Proceedings of the 12th International Conference on Data Science, Technology and Applications - DATA)

Full Text Available
Traditional and context-specific spam detection in low resource settings

https://doi.org/10.1007/s10994-022-06176-x

Kawintiranon, Kornraphop; Singh, Lisa; Budak, Ceren (July 2022, Machine Learning)

Full Text Available
A Guided Topic-Noise Model for Short Texts

https://doi.org/10.1145/3485447.3512007

Churchill, Robert; Singh, Lisa; Ryan, Rebecca; Davis-Kean, Pamela (April 2022, WWW '22: Proceedings of the ACM Web Conference 2022)

Researchers using social media data want to understand the discussions occurring in and about their respective fields. These domain experts often turn to topic models to help them see the entire landscape of the conversation, but unsupervised topic models often produce topic sets that miss topics experts expect or want to see. To solve this problem, we propose Guided Topic-Noise Model (GTM), a semi-supervised topic model designed with large domain-specific social media data sets in mind. The input to GTM is a set of topics that are of interest to the user and a small number of words or phrases that belong to those topics. These seed topics are used to guide the topic generation process, and can be augmented interactively, expanding the seed word list as the model provides new relevant words for different topics. GTM uses a novel initialization and a new sampling algorithm called Generalized Polya Urn (GPU) seed word sampling to produce a topic set that includes expanded seed topics, as well as new unsupervised topics. We demonstrate the robustness of GTM on open-ended responses from a public opinion survey and four domain-specific Twitter data sets.
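The abstract names Generalized Polya Urn (GPU) seed word sampling but does not spell out the update rule. As a rough illustration of the general idea only, the sketch below implements a textbook generalized Polya urn over words: a word is drawn in proportion to its current count, and the draw then reinforces both that word and a set of related words. It is not GTM's algorithm; the function name `gpu_draw`, the `related` mapping, and the `boost` parameter are hypothetical.

```python
import random
from collections import Counter


def gpu_draw(urn, related, rng, boost=1):
    """One draw from a generalized Polya urn over words (illustrative only).

    urn     -- Counter mapping word -> current count (mutated in place)
    related -- dict mapping a word to the words it should also reinforce
               (hypothetical stand-in for seed-word relationships)
    rng     -- random.Random instance, passed in for reproducibility
    boost   -- how many copies of each related word to add per draw
    """
    words = list(urn)
    weights = [urn[w] for w in words]
    # Sample a word with probability proportional to its count.
    word = rng.choices(words, weights=weights, k=1)[0]
    # Classic Polya urn step: the drawn word becomes more likely next time.
    urn[word] += 1
    # Generalized step: related words are reinforced as well.
    for other in related.get(word, ()):
        urn[other] += boost
    return word


# Usage: drawing a seed word also raises the counts of its related words.
urn = Counter({"vaccine": 1})
related = {"vaccine": ["dose", "booster"]}
drawn = gpu_draw(urn, related, random.Random(0))
```

The design point this is meant to convey is that reinforcement spreads beyond the sampled word, which is how a small seed list can pull related vocabulary into the same topic over repeated draws.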
Full Text Available
The Evolution of Topic Modeling

https://doi.org/10.1145/3507900

Churchill, Rob; Singh, Lisa (January 2022, ACM Computing Surveys)

Topic models have been applied to everything from books to newspapers to social media posts in an effort to identify the most prevalent themes of a text corpus. We provide an in-depth analysis of unsupervised topic models from their inception to today. We trace the origins of different types of contemporary topic models, beginning in the 1990s, and we compare their proposed algorithms, as well as their different evaluation approaches. Throughout, we also describe settings in which topic models have worked well and areas where new research is needed, setting the stage for the next generation of topic models.
Full Text Available
Students or Mechanical Turk: Who Are the More Reliable Social Media Data Labelers?

https://doi.org/10.5220/0011278600003269

Singh, Lisa; Vanarsdall, Rebecca; Wang, Yanchen; Gresenz, Carole (January 2022, Proceedings of the 11th International Conference on Data Science, Technology and Applications - DATA)

Full Text Available