Search for: All records

Creators/Authors contains: "Lee, Justin"


  1. The ability to compare objects, scenes, or situations is crucial for effective decision-making and problem-solving in everyday life. For instance, comparing the freshness of apples enables better choices during grocery shopping, while comparing sofa designs helps optimize the aesthetics of our living space. Despite its significance, the comparative capability is largely unexplored in artificial general intelligence (AGI). In this paper, we introduce MLLM-COMPBENCH, a benchmark designed to evaluate the comparative reasoning capability of multimodal large language models (MLLMs). MLLM-COMPBENCH mines and pairs images through visually oriented questions covering eight dimensions of relative comparison: visual attribute, existence, state, emotion, temporality, spatiality, quantity, and quality. We curate a collection of around 40K image pairs using metadata from diverse vision datasets and CLIP similarity scores. These image pairs span a broad array of visual domains, including animals, fashion, sports, and both outdoor and indoor scenes. The questions are carefully crafted to discern relative characteristics between two images and are labeled by human annotators for accuracy and relevance. We use MLLM-COMPBENCH to evaluate recent MLLMs, including GPT-4V(ision), Gemini-Pro, and LLaVA-1.6. Our results reveal notable shortcomings in their comparative abilities. We believe MLLM-COMPBENCH not only sheds light on these limitations but also establishes a solid foundation for future enhancements in the comparative capability of MLLMs. 
    Free, publicly-accessible full text available December 15, 2025
  2. Free, publicly-accessible full text available December 12, 2025
  3. Recent advances have greatly increased the capabilities of large language models (LLMs), but our understanding of the models and their safety has not progressed as fast. In this paper, we aim to understand LLMs more deeply by studying their individual neurons. We build upon previous work showing that large language models such as GPT-4 can be useful in explaining what each neuron in a language model does. Specifically, we analyze the effect of the prompt used to generate explanations and show that reformatting the explanation prompt in a more natural way can significantly improve neuron explanation quality and greatly reduce computational cost. We demonstrate the effects of our new prompts in three different ways, incorporating both automated and human evaluations. 
  4. Islamophobia, a negative predilection towards the Muslim community, is present on social media platforms. In addition to causing harm to victims, it also hurts the reputation of social media platforms that claim to provide a safe online environment for all users. The volume of social media content is too large to review manually, so it is important to find automated solutions to combat hate speech on social media platforms. Machine learning approaches have been used in the literature as a way to automate hate speech detection. In this paper, we use deep learning techniques to detect Islamophobia on Reddit and topic modeling to analyze the content and reveal topics from comments identified as Islamophobic. Some topics we identified include the Islamic dress code, religious practices, marriage, and politics. Among the deep learning models we evaluated, the highest performance was achieved by BERTbase+CNN, with an F1-score of 0.92. 
  5. null (Ed.)
  6. Bluetongue virus (BTV) is an arthropod-borne, segmented double-stranded RNA virus that can cause severe disease in both wild and domestic ruminants. BTV evolves via several key mechanisms, including the accumulation of mutations over time and the reassortment of genome segments. Additionally, BTV must maintain fitness in two disparate hosts, the insect vector and the ruminant. The specific features of viral adaptation in each host that permit host-switching are poorly characterized. Limited field studies and experimental work have pointed to these phenomena, but our understanding of the factors that drive or constrain BTV's genetic diversification remains incomplete. Current research leveraging novel approaches and whole genome sequencing applications promises to improve our understanding of BTV's evolution, ultimately contributing to the development of better predictive models and management strategies to reduce future impacts of bluetongue epizootics. 