skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Hall, D"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Social media continues to have an impact on the trajectory of humanity. However, its introduction has also weaponized keyboards, allowing the abusive language normally reserved for in-person bullying to jump onto the screen, i.e., cyberbullying. Cyberbullying poses a significant threat to adolescents globally, affecting the mental health and well-being of many. A group that is particularly at risk is the LGBTQ+ community, as researchers have uncovered a strong correlation between identifying as LGBTQ+ and suffering from greater online harassment. Therefore, it is critical to develop machine learning models that can accurately discern cyberbullying incidents as they happen to LGBTQ+ members. The aim of this study is to compare the efficacy of several transformer models in identifying cyberbullying targeting LGBTQ+ individuals. We seek to determine the relative merits and demerits of these existing methods in addressing complex and subtle kinds of cyberbullying by assessing their effectiveness with real social media data. 
    more » « less
  2. Social media discourse involves people from different backgrounds, beliefs, and motives. Thus, often such discourse can devolve into toxic interactions. Generative Models, such as Llama and ChatGPT, have recently exploded in popularity due to their capabilities in zero-shot question-answering. Because these models are increasingly being used to ask questions of social significance, a crucial research question is whether they can understand social media dynamics. This work provides a critical analysis regarding generative LLM’s ability to understand language and dynamics in social contexts, particularly considering cyberbullying and anti-cyberbullying (posts aimed at reducing cyberbullying) interactions. Specifically, we compare and contrast the capabilities of different large language models (LLMs) to understand three key aspects of social dynamics: language, directionality, and the occurrence of bullying/anti-bullying messages. We found that while fine-tuned LLMs exhibit promising results in some social media understanding tasks (understanding directionality), they presented mixed results in others (proper paraphrasing and bullying/anti-bullying detection). We also found that fine-tuning and prompt engineering mechanisms can have positive effects in some tasks. We believe that a understanding of LLM’s capabilities is crucial to design future models that can be effectively used in social applications. 
    more » « less
  3. Anti-Asian prejudice increased during the COVID-19 pandemic, evidenced by a rise in physical attacks on individuals of Asian descent. Concurrently, as many governments enacted stay-at-home mandates, the spread of anti-Asian content increased in online spaces, including social media platforms such as Twitter. In the present study, we investigated temporal and geographic patterns in the prevalence of social media content relevant to anti-Asian prejudice within the U.S. and worldwide. Specifically, we used the Twitter Data Collection API to query over 13 million tweets posted during the first 15 months of the pandemic (i.e., from January 30, 2020 to April 30, 2021), for both negative (e.g., #kungflu) and positive (e.g., #stopAAPIhate) hashtags and keywords related to anti-Asian prejudice. Results of a range of exploratory and descriptive analyses offer novel insights. For instance, in the U.S., results from a burst analysis indicated that the prevalence of negative (anti-Asian) and positive (counter-hate) messages fluctuated over time in patterns that largely mirrored salient events relevant to COVID-19 (e.g., political tweets, highly-visible hate crimes targeting Asians). Other representative findings include geographic differences in the frequency of negative and positive keywords that shed light on the regions within the U.S. and the countries worldwide in which negative and positive messages were most frequent. Additional analyses revealed informative patterns in the prevalence of original tweets versus retweets, the co-occurrence of negative and positive content within a tweet, and fluctuations in content in relation to the number of new COVID-19 cases and reported COVID-related deaths. Together, 
    more » « less
  4. Social media has revolutionized communication, allowing people worldwide to connect and interact instantly. However, it has also led to increases in cyberbullying, which poses a significant threat to children and adolescents globally, affecting their mental health and well-being. It is critical to accurately detect the roles of individuals involved in cyberbullying incidents to effectively address the issue on a large scale. This study explores the use of machine learning models to detect the roles involved in cyberbullying interactions. After examining the AMiCA dataset and addressing class imbalance issues, we evaluate the performance of various models built with four underlying LLMs (i.e. BERT, RoBERTa, T5, and GPT-2) for role detection. Our analysis shows that oversampling techniques help improve model performance. The best model, a fine-tuned RoBERTa using oversampled data, achieved an overall F1 score of 83.5%, increasing to 89.3% after applying a prediction threshold. The top-2 F1 score without thresholding was 95.7%. Our method outperforms previously proposed models. After investigating the per-class model performance and confidence scores, we show that the models perform well in classes with more samples and less contextual confusion (e.g. Bystander Other), but struggle with classes with fewer samples (e.g. Bystander Assistant) and more contextual ambiguity (e.g. Harasser and Victim). This work highlights current strengths and limitations in the development of accurate models with limited data and complex scenarios. 
    more » « less
  5. Increased social media use has contributed to the greater prevalence of abusive, rude, and offensive textual comments. Machine learning models have been developed to detect toxic comments online, yet these models tend to show biases against users with marginalized or minority identities (e.g., females and African Americans). Established research in debiasing toxicity classifiers often (1) takes a static or batch approach, assuming that all information is available and then making a one-time decision; and (2) uses a generic strategy to mitigate different biases (e.g., gender and racial biases) that assumes the biases are independent of one another. However, in real scenarios, the input typically arrives as a sequence of comments/words over time instead of all at once. Thus, decisions based on partial information must be made while additional input is arriving. Moreover, social bias is complex by nature. Each type of bias is defined within its unique context, which, consistent with intersectionality theory within the social sciences, might be correlated with the contexts of other forms of bias. In this work, we consider debiasing toxicity detection as a sequential decision-making process where different biases can be interdependent. In particular, we study debiasing toxicity detection with two aims: (1) to examine whether different biases tend to correlate with each other; and (2) to investigate how to jointly mitigate these correlated biases in an interactive manner to minimize the total amount of bias. At the core of our approach is a framework built upon theories of sequential Markov Decision Processes that seeks to maximize the prediction accuracy and minimize the bias measures tailored to individual biases. Evaluations on two benchmark datasets empirically validate the hypothesis that biases tend to be correlated and corroborate the effectiveness of the proposed sequential debiasing strategy. 
    more » « less
  6. Cyberbullying has become increasingly prevalent, particularly on social media. There has also been a steady rise in cyberbullying research across a range of disciplines. Much of the empirical work from computer science has focused on developing machine learning models for cyberbullying detection. Whereas machine learning cyberbullying detection models can be improved by drawing on psychological theories and perspectives, there is also tremendous potential for machine learning models to contribute to a better understanding of psychological aspects of cyberbullying. In this paper, we discuss how machine learning models can yield novel insights about the nature and defining characteristics of cyberbullying and how machine learning approaches can be applied to help clinicians, families, and communities reduce cyberbullying. Specifically, we discuss the potential for machine learning models to shed light on the repetitive nature of cyberbullying, the imbalance of power between cyberbullies and their victims, and causal mechanisms that give rise to cyberbullying. We orient our discussion on emerging and future research directions, as well as the practical implications of machine learning cyberbullying detection models. 
    more » « less
  7. As online communication continues to become more prevalent, instances of cyberbullying have also become more common, particularly on social media sites. Previous research in this area has studied cyberbullying outcomes, predictors of cyberbullying victimization/perpetration, and computational detection models that rely on labeled datasets to identify the underlying patterns. However, there is a dearth of work examining the content of what is said when cyberbullying occurs and most of the available datasets include only basic la-bels (cyberbullying or not). This paper presents an annotated Instagram dataset with detailed labels about key cyberbullying properties, such as the content type, purpose, directionality, and co-occurrence with other phenomena, as well as demographic information about the individuals who performed the annotations. Additionally, results of an exploratory logistic regression analysis are reported to illustrate how new insights about cyberbullying and its automatic detection can be gained from this labeled dataset. 
    more » « less
  8. Abstract Discrete symmetries are spatially ubiquitous but are often hidden in internal states of systems where they can have especially profound consequences. In this work we create and verify exotic magnetic phases of atomic spinor Bose–Einstein condensates that, despite their continuous character and intrinsic spatial isotropy, exhibit complex discrete polytope symmetries in their topological defects. Using carefully tailored spinor rotations and microwave transitions, we engineer singular line defects whose quantization conditions, exchange statistics, and dynamics are fundamentally determined by these underlying symmetries. We show how filling the vortex line singularities with atoms in a variety of different phases leads to core structures that possess magnetic interfaces with rich combinations of discrete and continuous symmetries. Such defects, with their non-commutative properties, could provide unconventional realizations of quantum information and interferometry. 
    more » « less
  9. Prejudice and hate directed toward Asian individuals has increased in prevalence and salience during the COVID-19 pandemic, with notable rises in physical violence. Concurrently, as many governments enacted stay-at-home mandates, the spread of anti-Asian content increased in online spaces, including social media. In the present study, we investigated temporal and geographical patterns in social media content relevant to anti-Asian prejudice during the COVID-19 pandemic. Using the Twitter Data Collection API, we queried over 13 million tweets posted between January 30, 2020, and April 30, 2021, for both negative (e.g., #kungflu) and positive (e.g., #stopAAPIhate) hashtags and keywords related to anti-Asian prejudice. In a series of descriptive analyses, we found differences in the frequency of negative and positive keywords based on geographic location. Using burst detection, we also identified distinct increases in negative and positive content in relation to key political tweets and events. These largely exploratory analyses shed light on the role of social media in the expression and proliferation of prejudice as well as positive responses online. 
    more » « less