NSF PAR Search | NSF Public Access Repository


Right for the Right Reason: Evidence Extraction for Trustworthy Tabular Reasoning

https://doi.org/10.18653/v1/2022.acl-long.231

Gupta, Vivek; Zhang, Shuo; Vempala, Alakananda; He, Yujie; Choji, Temma; Srikumar, Vivek (January 2022, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics)

Full Text Available
Evaluating Relaxations of Logic for Neural Networks: A Comprehensive Study

https://doi.org/10.24963/ijcai.2021/387

Medina Grespan, Mattia; Gupta, Ashim; Srikumar, Vivek (August 2021, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21))

Symbolic knowledge can provide crucial inductive bias for training neural models, especially in low-data regimes. A successful strategy for incorporating such knowledge involves relaxing logical statements into sub-differentiable losses for optimization. In this paper, we study the question of how best to relax logical expressions that represent labeled examples and knowledge about a problem; we focus on sub-differentiable t-norm relaxations of logic. We present theoretical and empirical criteria for characterizing which relaxation would perform best in various scenarios. In our theoretical study, driven by the goal of preserving tautologies, the Lukasiewicz t-norm performs best. However, in our empirical analysis on the text chunking and digit recognition tasks, the product t-norm achieves the best predictive performance. We analyze this apparent discrepancy, and conclude with a list of best practices for defining loss functions via logic.
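For readers unfamiliar with the two relaxations named in this abstract, the following is a minimal sketch of their textbook definitions, not the paper's implementation. The function names and the negative-log loss are illustrative assumptions; the abstract does not specify the exact losses used.

```python
import math

# Truth values are floats in [0, 1]. All names below are hypothetical.

def product_and(a: float, b: float) -> float:
    """Product t-norm: conjunction as multiplication."""
    return a * b

def lukasiewicz_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: conjunction as max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)

def product_implies(a: float, b: float) -> float:
    """Residuum of the product t-norm: 1 if a <= b, else b / a."""
    return 1.0 if a <= b else b / a

def lukasiewicz_implies(a: float, b: float) -> float:
    """Residuum of the Lukasiewicz t-norm: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def neg_log_loss(truth: float, eps: float = 1e-12) -> float:
    """One assumed way to turn a relaxed truth value into a loss."""
    return -math.log(max(truth, eps))

if __name__ == "__main__":
    a, b = 0.8, 0.4  # relaxed truth of premise and conclusion
    print("product  a AND b:", product_and(a, b))
    print("Lukasiew a AND b:", lukasiewicz_and(a, b))
    print("product  a -> b :", product_implies(a, b))
    print("Lukasiew a -> b :", lukasiewicz_implies(a, b))
    print("loss on product implication:", neg_log_loss(product_implies(a, b)))
```

Both relaxations agree with Boolean logic when the inputs are exactly 0 or 1; they differ only on intermediate values, which is where the choice of relaxation can affect training.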
Full Text Available
BERT & Family Eat Word Salad: Experiments with Text Understanding

Gupta, Ashim; Kvernadze, Giorgi; Srikumar, Vivek (April 2021, Proceedings of the AAAI Conference on Artificial Intelligence)
In this paper, we study the response of large models from the BERT family to incoherent inputs that should confuse any model that claims to understand natural language. We define simple heuristics to construct such examples. Our experiments show that state-of-the-art models consistently fail to recognize them as ill-formed, and instead produce high-confidence predictions on them. As a consequence of this phenomenon, models trained on sentences with randomly permuted word order perform close to state-of-the-art models. To alleviate these issues, we show that if models are explicitly trained to recognize invalid inputs, they can be robust to such attacks without a drop in performance.
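As an illustration of the kind of incoherent input the abstract mentions, the sketch below randomly permutes the word order of a sentence. It assumes plain whitespace tokenization, and the function name `shuffle_words` is hypothetical; the paper's own heuristics are not given in this record and may differ.

```python
import random

def shuffle_words(sentence: str, seed: int = 0) -> str:
    """Return the sentence with its whitespace-separated words randomly permuted.

    A fixed seed keeps the permutation reproducible across runs.
    """
    words = sentence.split()
    rng = random.Random(seed)  # local generator; does not touch global state
    rng.shuffle(words)         # in-place random permutation
    return " ".join(words)

if __name__ == "__main__":
    original = "the movie was surprisingly good despite its slow start"
    print(shuffle_words(original, seed=1))
```

The output contains exactly the same words as the input, so any change in a model's prediction (or lack of one) can be attributed to word order alone.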
Full Text Available
A Year of Automated Anomaly Detection in a Datacenter

Ahmed, Rufaida; Porter, Joseph; Abdelmutalab, Abubaker; Ricci, Robert (October 2020, Proceedings of the 2nd workshop on Machine Learning for Computing Systems (MLCS))
Anomaly detection based on Machine Learning can be a powerful tool for understanding the behavior of large, complex computer systems in the wild. The set of anomalies seen, however, can change over time: as the system evolves, is put to different uses, and encounters different workloads, both its ‘typical’ behavior and the anomalies that it encounters can change as well. This naturally raises two questions: how effective is automated anomaly detection in this setting, and how much does anomalous behavior change over time? In this paper, we examine these questions for a dataset taken from a system that manages the lifecycle of servers in datacenters. We look at logs from one year of operation of a datacenter of about 500 servers. Applying state-of-the-art techniques for finding anomalous events, we find that there is a ‘core’ set of anomaly patterns that persists over the entire period studied, but that, to track the evolution of the system, we must re-train the detector periodically. Working with the administrators of this system, we find that, despite these changes in patterns, they still contain actionable insights.
Full Text Available
Learning Constraints for Structured Prediction Using Rectifier Networks

Pan, Xingyuan; Mehta, Maitrey; Srikumar, Vivek (July 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics)

Various natural language processing tasks are structured prediction problems where outputs are constructed with multiple interdependent decisions. Past work has shown that domain knowledge, framed as constraints over the output space, can help improve predictive accuracy. However, designing good constraints often relies on domain expertise. In this paper, we study the problem of learning such constraints. We frame the problem as that of training a two-layer rectifier network to identify valid structures or substructures, and show a construction for converting a trained network into a system of linear constraints over the inference variables. Our experiments on several NLP tasks show that the learned constraints can improve the prediction accuracy, especially when the number of training examples is small.
Full Text Available
Structured Tuning for Semantic Role Labeling

Li, Tao; Jawale, Parth Anand; Palmer, Martha; Srikumar, Vivek (July 2020, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics)

Recent neural network-driven semantic role labeling (SRL) systems have shown impressive improvements in F1 scores. These improvements are due to expressive input representations, which, at least at the surface, are orthogonal to knowledge-rich constrained decoding mechanisms that helped linear SRL models. Introducing the benefits of structure to inform neural models presents a methodological challenge. In this paper, we present a structured tuning framework to improve models using softened constraints only at training time. Our framework leverages the expressiveness of neural networks and provides supervision with structured loss components. We start with a strong baseline (RoBERTa) to validate the impact of our approach, and show that our framework outperforms the baseline by learning to comply with declarative constraints. Additionally, our experiments with smaller training sizes show that we can achieve consistent improvements under low-resource scenarios.
Full Text Available
A Logic-Driven Framework for Consistency of Neural Models

https://doi.org/10.18653/v1/D19-1405

Li, Tao; Gupta, Vivek; Mehta, Maitrey; Srikumar, Vivek (November 2019, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP))

While neural models show remarkable accuracy on individual predictions, their internal beliefs can be inconsistent across examples. In this paper, we formalize such inconsistency as a generalization of prediction error. We propose a learning framework for constraining models using logic rules to regularize them away from inconsistency. Our framework can leverage both labeled and unlabeled examples and is directly compatible with off-the-shelf learning schemes without model redesign. We instantiate our framework on natural language inference, where experiments show that enforcing invariants stated in logic can help make the predictions of neural models both accurate and consistent.
Full Text Available
Augmenting Neural Networks with First-order Logic

https://doi.org/10.18653/v1/P19-1028

Li, Tao; Srikumar, Vivek (July 2019, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics)

Today, the dominant paradigm for training neural networks involves minimizing task loss on a large dataset. Using world knowledge to inform a model while retaining the ability to perform end-to-end training remains an open question. In this paper, we present a novel framework for introducing declarative knowledge to neural network architectures in order to guide training and prediction. Our framework systematically compiles logical statements into computation graphs that augment a neural network without extra learnable parameters or manual redesign. We evaluate our modeling strategy on three tasks: machine comprehension, natural language inference, and text chunking. Our experiments show that knowledge-augmented networks can strongly improve over baselines, especially in low-data regimes.
Full Text Available
