NSF PAR Search | NSF Public Access Repository

State Space Models are Strong Text Rerankers

https://doi.org/10.18653/v1/2025.repl4nlp-1.12

Xu, Zhichao; Yan, Jinghua; Gupta, Ashim; Srikumar, Vivek (May 2025, Proceedings of the 10th Workshop on Representation Learning for NLP (RepL4NLP 2025), Association for Computational Linguistics)

Free, publicly-accessible full text available May 4, 2026

Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression

https://doi.org/10.18653/v1/2024.findings-emnlp.901

Xu, Zhichao; Gupta, Ashim; Li, Tao; Bentham, Oliver; Srikumar, Vivek (January 2024, Association for Computational Linguistics)

Full Text Available

Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study

https://doi.org/10.24963/ijcai.2021/387

Medina Grespan, Mattia; Gupta, Ashim; Srikumar, Vivek (August 2021, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21))

Symbolic knowledge can provide crucial inductive bias for training neural models, especially in low data regimes. A successful strategy for incorporating such knowledge involves relaxing logical statements into sub-differentiable losses for optimization. In this paper, we study the question of how best to relax logical expressions that represent labeled examples and knowledge about a problem; we focus on sub-differentiable t-norm relaxations of logic. We present theoretical and empirical criteria for characterizing which relaxation would perform best in various scenarios. In our theoretical study driven by the goal of preserving tautologies, the Lukasiewicz t-norm performs best. However, in our empirical analysis on the text chunking and digit recognition tasks, the product t-norm achieves best predictive performance. We analyze this apparent discrepancy, and conclude with a list of best practices for defining loss functions via logic.
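
As a rough illustration of what "preserving tautologies" can mean for a t-norm relaxation, the sketch below compares the product and Lukasiewicz relaxations on the law of the excluded middle (A or not A). It is not the paper's code: the function names are hypothetical, negation is assumed to be 1 - a, and disjunction is assumed to be the t-conorm dual to each t-norm.

```python
# Minimal sketch of two t-norm relaxations of Boolean logic.
# Truth values are floats in [0, 1]. All names here are illustrative
# assumptions, not identifiers from the paper.

def product_and(a: float, b: float) -> float:
    """Product t-norm: relaxed conjunction."""
    return a * b


def product_or(a: float, b: float) -> float:
    """Probabilistic sum: the t-conorm dual to the product t-norm."""
    return a + b - a * b


def lukasiewicz_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: relaxed conjunction."""
    return max(0.0, a + b - 1.0)


def lukasiewicz_or(a: float, b: float) -> float:
    """Bounded sum: the t-conorm dual to the Lukasiewicz t-norm."""
    return min(1.0, a + b)


def negate(a: float) -> float:
    """Standard negation (an assumption of this sketch)."""
    return 1.0 - a


def excluded_middle(or_fn, a: float) -> float:
    """Relaxed truth value of the tautology (A or not A)."""
    return or_fn(a, negate(a))


if __name__ == "__main__":
    for a in (0.0, 0.25, 0.5, 1.0):
        print(a, excluded_middle(product_or, a), excluded_middle(lukasiewicz_or, a))
```

Under these assumptions, the Lukasiewicz relaxation assigns the tautology a truth value of 1 for every input, while the product relaxation dips below 1 whenever the input is strictly between 0 and 1. This covers only one tautology and says nothing about the empirical results the abstract reports.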
Full Text Available

BERT & Family Eat Word Salad: Experiments with Text Understanding

Gupta, Ashim; Kvernadze, Giorgi; Srikumar, Vivek (April 2021, Proceedings of the AAAI Conference on Artificial Intelligence)
In this paper, we study the response of large models from the BERT family to incoherent inputs that should confuse any model that claims to understand natural language. We define simple heuristics to construct such examples. Our experiments show that state-of-the-art models consistently fail to recognize them as ill-formed, and instead produce high confidence predictions on them. As a consequence of this phenomenon, models trained on sentences with randomly permuted word order perform close to state-of-the-art models. To alleviate these issues, we show that if models are explicitly trained to recognize invalid inputs, they can be robust to such attacks without a drop in performance.
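
One simple way to build the kind of incoherent input the abstract describes is to shuffle the words of a sentence. The sketch below assumes whitespace tokenization and a seeded shuffle; the function name `permute_words` is hypothetical, and the paper's actual heuristics may differ.

```python
import random


def permute_words(sentence: str, seed: int = 0) -> str:
    """Return the sentence with its whitespace-separated words shuffled.

    Illustrative only: whitespace tokenization and a seeded shuffle are
    assumptions of this sketch, not details taken from the paper.
    """
    tokens = sentence.split()
    rng = random.Random(seed)  # local RNG so results are reproducible
    rng.shuffle(tokens)
    return " ".join(tokens)


if __name__ == "__main__":
    print(permute_words("the movie was surprisingly good", seed=1))
```

The output contains exactly the same words as the input, so any model that relies mostly on word identity rather than word order would be expected to treat the two similarly.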
Full Text Available
