NSF PAR Search | NSF Public Access Repository

MaskPure: Improving Defense Against Text Adversaries with Stochastic Purification

Gietz, Harrison; Kalita, Jugal (June 2024, International Conference on Natural Language & Information Systems)

The improvement of language model robustness, including successful defense against adversarial attacks, remains an open problem. In computer vision settings, the stochastic noising and de-noising process provided by diffusion models has proven useful for purifying input images, thus improving model robustness against adversarial attacks. Similarly, some initial work has explored the use of random noising and de-noising to mitigate adversarial attacks in an NLP setting, but improving the quality and efficiency of these methods is necessary for them to remain competitive. We extend upon methods of input text purification that are inspired by diffusion processes, which randomly mask and refill portions of the input text before classification. Our novel method, MaskPure, exceeds or matches robustness compared to other contemporary defenses, while also requiring no adversarial classifier training and without assuming knowledge of the attack type. In addition, we show that MaskPure is provably certifiably robust. To our knowledge, MaskPure is the first stochastic-purification method with demonstrated success against both character-level and word-level attacks, indicating the generalizable and promising nature of stochastic denoising defenses. In summary: the MaskPure algorithm bridges literature on the current strongest certifiable and empirical adversarial defense methods, showing that both theoretical and practical robustness can be obtained together. Code is available on GitHub.
Full Text Available
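
The mask-and-refill idea in the MaskPure abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `mask_tokens`, `purify_and_classify`, `fill_masks`, and `classify` are hypothetical, whitespace tokenization is a simplification, and combining several purified copies by majority vote is an assumption that the abstract does not state. In practice `fill_masks` would wrap a masked language model and `classify` the classifier being defended.

```python
import random
from collections import Counter

MASK = "[MASK]"  # placeholder token; the real mask symbol depends on the model


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else tok for tok in tokens]


def purify_and_classify(text, fill_masks, classify, n_copies=5, rate=0.3, seed=0):
    """Classify `text` after several random mask-and-refill passes.

    fill_masks: callable taking a token list and returning a token list
                with every MASK replaced (hypothetical stand-in for a
                masked language model).
    classify:   callable taking a token list and returning a label.
    The majority vote over copies is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    tokens = text.split()  # simplification: whitespace tokenization
    votes = Counter()
    for _ in range(n_copies):
        masked = mask_tokens(tokens, rate, rng)
        refilled = fill_masks(masked)
        votes[classify(refilled)] += 1
    # Return the label predicted for the largest number of purified copies.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model.
    toy_fill = lambda toks: ["the" if t == MASK else t for t in toks]
    toy_classify = lambda toks: "positive" if "good" in toks else "negative"
    print(purify_and_classify("a really good movie", toy_fill, toy_classify))
```

Because an adversarial edit may be masked out in some copies and refilled with a benign token, aggregating over several copies is one plausible way such a defense could dilute the perturbation.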
Language Model Sentence Completion with a Parser-Driven Rhetorical Control Method

Zingale, Joshua; Kalita, Jugal (March 2024, aclanthology.org)

Controlled text generation (CTG) seeks to guide large language model (LLM) output to produce text that conforms to desired criteria. The current study presents a novel CTG algorithm that enforces adherence toward specific rhetorical relations in an LLM sentence-completion context by a parser-driven decoding scheme that requires no model fine-tuning. The method is validated both with automatic and human evaluation. The code is accessible on GitHub.
Full Text Available
Action-Item-Driven Summarization of Long Meeting Transcripts

Golia, Logan; Kalita, Jugal (December 2023, 7th International Conference on Natural Language Processing and Information Retrieval, Seoul, South Korea, December 15-17, 2023)

The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
Full Text Available
Action-Item-Driven Summarization of Long Meeting Transcripts

Golia, Logan; Kalita, Jugal (December 2023, ACM Digital Library)

The increased prevalence of online meetings has significantly enhanced the practicality of a model that can automatically generate the summary of a given meeting. This paper introduces a novel and effective approach to automate the generation of meeting summaries. Current approaches to this problem generate general and basic summaries, considering the meeting simply as a long dialogue. However, our novel algorithms can generate abstractive meeting summaries that are driven by the action items contained in the meeting transcript. This is done by recursively generating summaries and employing our action-item extraction algorithm for each section of the meeting in parallel. All of these sectional summaries are then combined and summarized together to create a coherent and action-item-driven summary. In addition, this paper introduces three novel methods for dividing up long transcripts into topic-based sections to improve the time efficiency of our algorithm, as well as to resolve the issue of large language models (LLMs) forgetting long-term dependencies. Our pipeline achieved a BERTScore of 64.98 across the AMI corpus, which is an approximately 4.98% increase from the current state-of-the-art result produced by a fine-tuned BART (Bidirectional and Auto-Regressive Transformers) model.
Full Text Available
Privacy-Preserving Trust Management For Vehicular Communications and Federated Learning

https://doi.org/10.1109/SVCC56964.2023.10165137

Byun, SangHyun; Sarker, Arijet; Lew, Ken; Kalita, Jugal; Chang, Sang-Yoon (May 2023, IEEE)
HiC-GNN: A generalizable model for 3D chromosome reconstruction using graph convolutional neural networks

https://doi.org/10.1016/j.csbj.2022.12.051

Hovenga, Van; Kalita, Jugal; Oluwadare, Oluwatosin (January 2023, Computational and Structural Biotechnology Journal)

Full Text Available
Exploring Sentence Vector Spaces through Automatic Summarization

Templeton, Adly; Kalita, Jugal (December 2018, International Conference on Machine Learning Applications (ICMLA), Orlando.)

Given vector representations for individual words, it is necessary to compute vector representations of sentences for many applications in a compositional manner, often using artificial neural networks. Relatively little work has explored the internal structure and properties of such sentence vectors. In this paper, we explore the properties of sentence vectors in the context of automatic summarization. In particular, we show that cosine similarity between sentence vectors and document vectors is strongly correlated with sentence importance and that vector semantics can identify and correct gaps between the sentences chosen so far and the document. In addition, we identify specific dimensions which are linked to effective summaries. To our knowledge, this is the first time specific dimensions of sentence embeddings have been connected to sentence properties. We also compare the features of different methods of sentence embeddings. Many of these insights have applications in uses of sentence embeddings far beyond summarization.
Full Text Available
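
The abstract above reports that cosine similarity between a sentence vector and the document vector tracks sentence importance. A minimal sketch of how such a ranking could be computed follows. The function names are hypothetical, and taking the document vector to be the mean of the sentence vectors is an assumption made for illustration; the paper may construct document vectors differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_sentences(sentence_vectors):
    """Return sentence indices ordered from most to least similar to the document.

    Assumption for this sketch: the document vector is the element-wise
    mean of the sentence vectors.
    """
    n = len(sentence_vectors)
    dim = len(sentence_vectors[0])
    doc = [sum(vec[i] for vec in sentence_vectors) / n for i in range(dim)]
    scores = [cosine(vec, doc) for vec in sentence_vectors]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real sentence vectors have hundreds of dimensions.
    vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    print(rank_sentences(vectors))
```

An extractive summarizer built on this idea would take the top-ranked indices as candidate summary sentences.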
Parallel Attention Mechanisms in Neural Machine Translation

Medina, Julian; Kalita, Jugal (December 2018, International Conference on Machine Learning Applications)

Recent papers in neural machine translation have proposed the strict use of attention mechanisms over previous standards such as recurrent and convolutional neural networks (RNNs and CNNs). We propose that by running traditionally stacked encoding branches from encoder-decoder attention-focused architectures in parallel, even more sequential operations can be removed from the model, thereby decreasing training time. In particular, we modify the recently published attention-based architecture called Transformer by Google, by replacing sequential attention modules with parallel ones, reducing the amount of training time and substantially improving BLEU scores at the same time. Experiments over the English to German and English to French translation tasks show that our model establishes a new state of the art.
Full Text Available
Abstractive Summarization Using Attentive Neural Techniques

Krantz, Jacob; Kalita, Jugal (December 2018, International Conference on Natural Language Processing)

In a world of proliferating data, the ability to rapidly summarize text is growing in importance. Automatic summarization of text can be thought of as a sequence to sequence problem. Another area of natural language processing that solves a sequence to sequence problem is machine translation, which is rapidly evolving due to the development of attention-based encoder-decoder networks. This work applies these modern techniques to abstractive summarization. We perform analysis on various attention mechanisms for summarization with the goal of developing an approach and architecture aimed at improving the state of the art. In particular, we modify and optimize a translation model with self-attention for generating abstractive sentence summaries. The effectiveness of this base model along with attention variants is compared and analyzed in the context of standardized evaluation sets and test metrics. However, we show that these metrics are limited in their ability to effectively score abstractive summaries, and propose a new approach based on the intuition that an abstractive model requires an abstractive evaluation.
Full Text Available
Hierarchical Text Generation using an Outline

Drissi, Mehdi; Kalita, Jugal (December 2018, International Conference on Natural Language Processing)

Many challenges in natural language processing require generating text, including language translation, dialogue generation, and speech recognition. For all of these problems, text generation becomes more difficult as the text becomes longer. Current language models often struggle to keep track of coherence for long pieces of text. Here, we attempt to have the model construct and use an outline of the text it generates to keep it focused. We find that the usage of an outline improves perplexity. We do not find that using the outline improves human evaluation over a simpler baseline, revealing a discrepancy in perplexity and human perception. Similarly, hierarchical generation is not found to improve human evaluation scores.
Full Text Available
