Search for: All records

Creators/Authors contains: "Li, Junyi"

Total Resources

51

Multilingual Code Co-evolution using Large Language Models

https://doi.org/10.1145/3611643.3616350

Zhang, Jiyang; Nie, Pengyu; Li, Junyi Jessy; Gligoric, Milos (November 2023, ACM)

Free, publicly-accessible full text available November 30, 2024

Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias

Govindarajan, Venkata Subrahmanyan; Beaver, David; Mahowald, Kyle; Li, Junyi Jessy (July 2023, Findings of the Association for Computational Linguistics: ACL 2023)

While existing work on studying bias in NLP focuses on negative or pejorative language use, Govindarajan et al. (2023) offer a revised framing of bias in terms of intergroup social context, and its effects on language behavior. In this paper, we investigate if two pragmatic features (specificity and affect) systematically vary in different intergroup contexts — thus connecting this new framing of bias to language output. Preliminary analysis finds modest correlations between specificity and affect of tweets with supervised intergroup relationship (IGR) labels. Counterfactual probing further reveals that while neural models finetuned for predicting IGR reliably use affect in classification, the model’s usage of specificity is inconclusive.
Free, publicly-accessible full text available July 1, 2024

Summarizing, Simplifying, and Synthesizing Medical Evidence using GPT-3 (with Varying Success)

Shaib, Chantal; Li, Millicent; Joseph, Sebastian; Marshall, Iain; Li, Junyi Jessy; Wallace, Byron (July 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers))

Large language models, particularly GPT-3, are able to produce high-quality summaries of general domain news articles in few- and zero-shot settings. However, it is unclear if such models are similarly capable in more specialized domains such as biomedicine. In this paper we enlist domain experts (individuals with medical training) to evaluate summaries of biomedical articles generated by GPT-3, given no supervision. We consider both single- and multi-document settings. In the former, GPT-3 is tasked with generating regular and plain-language summaries of articles describing randomized controlled trials; in the latter, we assess the degree to which GPT-3 is able to synthesize evidence reported across a collection of articles. We design an annotation scheme for evaluating model outputs, with an emphasis on assessing the factual accuracy of generated summaries. We find that while GPT-3 is able to summarize and simplify single biomedical articles faithfully, it struggles to provide accurate aggregations of findings over multiple documents. We release all data, code, and annotations used in this work.
Free, publicly-accessible full text available July 1, 2024

Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion

Ko, Wei-Jen; Wu, Yating; Dalton, Cutter; Srinivas, Dananjay; Durrett, Greg; Li, Junyi Jessy (July 2023, Findings of the Association for Computational Linguistics: ACL 2023)

Automatic discourse processing is bottlenecked by data: current discourse formalisms pose highly demanding annotation tasks involving large taxonomies of discourse relations, making them inaccessible to lay annotators. This work instead adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis and seeks to derive QUD structures automatically. QUD views each sentence as an answer to a question triggered in prior context; thus, we characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained taxonomies. We develop the first-of-its-kind QUD parser that derives a dependency structure of questions over full documents, trained using a large, crowdsourced question-answering dataset DCQA (Ko et al., 2022). Human evaluation results show that QUD dependency parsing is possible for language models trained with this crowdsourced, generalizable annotation scheme. We illustrate how our QUD structure is distinct from RST trees, and demonstrate the utility of QUD analysis in the context of document simplification. Our findings show that QUD parsing is an appealing alternative for automatic discourse processing.
Free, publicly-accessible full text available July 1, 2024

Unsupervised Extractive Summarization of Emotion Triggers

Sosea, Tiberiu; Zhan, Hongli; Li, Junyi Jessy; Caragea, Cornelia (May 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers))

Understanding what leads to emotions during large-scale crises is important as it can provide groundings for expressed emotions and subsequently improve the understanding of ongoing disasters. Recent approaches trained supervised models to both detect emotions and explain emotion triggers (events and appraisals) via abstractive summarization. However, obtaining timely and qualitative abstractive summaries is expensive and extremely time-consuming, requiring highly trained expert annotators. In time-sensitive, high-stakes contexts, this can block necessary responses. We instead pursue unsupervised systems that extract triggers from text. First, we introduce CovidET-EXT, augmenting the abstractive dataset of Zhan et al. (2022), built in the context of the COVID-19 crisis, with extractive triggers. Second, we develop new unsupervised learning models that can jointly detect emotions and summarize their triggers. Our best approach, entitled Emotion-Aware Pagerank, incorporates emotion information from external sources combined with a language understanding module, and outperforms strong baselines. We release our data and code at https://github.com/tsosea2/CovidET-EXT.
Free, publicly-accessible full text available May 1, 2024

Learning Deep Semantics for Test Completion

https://doi.org/10.1109/ICSE48619.2023.00178

Nie, Pengyu; Banerjee, Rahul; Li, Junyi Jessy; Mooney, Raymond J.; Gligoric, Milos (May 2023, International Conference on Software Engineering)

Free, publicly-accessible full text available May 1, 2024

How people talk about each other: Modeling Generalized Intergroup Bias and Emotion

Govindarajan, Venkata Subrahmanyan; Atwell, Katherine; Sinno, Barea; Alikhani, Malihe; Beaver, David I.; Li, Junyi Jessy (May 2023, Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics)

Current studies of bias in NLP rely mainly on identifying (unwanted or negative) bias towards a specific demographic group. While this has led to progress in recognizing and mitigating negative bias, and having a clear notion of the targeted group is necessary, it is not always practical. In this work we extrapolate to a broader notion of bias, rooted in social science and psychology literature. We move towards predicting interpersonal group relationship (IGR), that is, modeling the relationship between the speaker and the target in an utterance, using fine-grained interpersonal emotions as an anchor. We build and release a dataset of English tweets by US Congress members annotated for interpersonal emotion (the first of its kind) and with ‘found supervision’ for IGR labels; our analyses show that subtle emotional signals are indicative of different biases. While humans can perform better than chance at identifying IGR given an utterance, we show that neural models perform much better; furthermore, a shared encoding between IGR and interpersonal perceived emotion enabled performance gains in both tasks.
Free, publicly-accessible full text available May 1, 2024

FALTE: A Toolkit for Fine-grained Annotation for Long Text Evaluation

Goyal, Tanya; Li, Junyi Jessy; Durrett, Greg (December 2022, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations)

A growing swath of NLP research is tackling problems related to generating long text, including tasks such as open-ended story generation, summarization, dialogue, and more. However, we currently lack appropriate tools to evaluate these long outputs of generation models: classic automatic metrics such as ROUGE have been shown to perform poorly, and newer learned metrics do not necessarily work well for all tasks and domains of text. Human rating and error analysis remain crucial components of any evaluation of long text generation. In this paper, we introduce FALTE, a web-based annotation toolkit designed to address this shortcoming. Our tool allows researchers to collect fine-grained judgments of text quality from crowdworkers using an error taxonomy specific to the downstream task. Using the task interface, annotators can select and assign error labels to text span selections in an incremental paragraph-level annotation workflow. The latter functionality is designed to simplify the document-level task into smaller units and reduce cognitive load on the annotators. Our tool has previously been used to run a large-scale annotation study that evaluates the coherence of long generated summaries, demonstrating its utility.
Full Text Available

SNaC: Coherence Error Detection for Narrative Summarization

Goyal, Tanya; Li, Junyi Jessy; Durrett, Greg (December 2022, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing)

Progress in summarizing long texts is inhibited by the lack of appropriate evaluation frameworks. A long summary that appropriately covers the facets of that text must also present a coherent narrative, but current automatic and human evaluation methods fail to identify gaps in coherence. In this work, we introduce SNaC, a narrative coherence evaluation framework for fine-grained annotations of long summaries. We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie summaries. Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowdworkers. Furthermore, we show that the collected annotations allow us to benchmark past work in coherence modeling and train a strong classifier for automatically localizing coherence errors in generated summaries. Finally, our SNaC framework can support future work in long document summarization and coherence evaluation, including improved summarization modeling and post-hoc summary correction.
Full Text Available

Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts

Zhan, Hongli; Sosea, Tiberiu; Caragea, Cornelia; Li, Junyi Jessy (December 2022, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing)

Crises such as the COVID-19 pandemic continuously threaten our world and emotionally affect billions of people worldwide in distinct ways. Understanding the triggers leading to people’s emotions is of crucial importance. Social media posts can be a good source of such analysis, yet these texts tend to be charged with multiple emotions, with triggers scattered across multiple sentences. This paper takes a novel angle, namely, emotion detection and trigger summarization, aiming to both detect perceived emotions in text, and summarize events and their appraisals that trigger each emotion. To support this goal, we introduce CovidET (Emotions and their Triggers during Covid-19), a dataset of ~1,900 English Reddit posts related to COVID-19, which contains manual annotations of perceived emotions and abstractive summaries of their triggers described in the post. We develop strong baselines to jointly detect emotions and summarize emotion triggers. Our analyses show that CovidET presents new challenges in emotion-specific summarization, as well as multi-emotion detection in long social media posts.
Full Text Available