skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, April 12 until 2:00 AM ET on Saturday, April 13 due to maintenance. We apologize for the inconvenience.

Title: Alignment Rationale for Query-Document Relevance
Deep neural networks are widely used for text pair classification tasks such as as adhoc retrieval. These deep neural networks are not inherently interpretable and require additional efforts to get rationale behind their decisions. Existing explanation models are not yet capable of inducing alignments between the query terms and the document terms -- which part of the document rationales are responsible for which part of the query? In this paper, we study how the input perturbations can be used to infer or evaluate alignments between the query and document spans, which best explain the black-box ranker’s relevance prediction. We use different perturbation strategies and accordingly propose a set of metrics to evaluate the faithfulness of alignment rationales to the model. Our experiments show that defined metrics based on substitution-based perturbation are more successful in preferring higher-quality alignments, compared to the deletion-based metrics.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 22)
Page Range / eLocation ID:
2489 to 2494
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We address the problem of ad hoc table retrieval via a new neural architecture that incorporates both semantic and relevance matching. Understanding the connection between the structured form of a table and query tokens is an important yet neglected problem in information retrieval. We use a learning- to-rank approach to train a system to capture semantic and relevance signals within interactions between the structured form of candidate tables and query tokens. Convolutional filters that extract contextual features from query/table interactions are combined with a feature vector based on the distributions of term similarity between queries and tables. We propose using row and column summaries to incorporate table content into our new neural model. We evaluate our approach using two datasets, and we demonstrate substantial improvements in terms of retrieval metrics over state-of-the-art methods in table retrieval and document retrieval, and neural architectures from sentence, document, and table type classification adapted to the table retrieval task. Our ablation study supports the importance of both semantic and relevance matching in the table retrieval. 
    more » « less
  2. Martelli, Pier Luigi (Ed.)
    Abstract Motivation Accurate prediction of residue-residue distances is important for protein structure prediction. We developed several protein distance predictors based on a deep learning distance prediction method and blindly tested them in the 14th Critical Assessment of Protein Structure Prediction (CASP14). The prediction method uses deep residual neural networks with the channel-wise attention mechanism to classify the distance between every two residues into multiple distance intervals. The input features for the deep learning method include co-evolutionary features as well as other sequence-based features derived from multiple sequence alignments (MSAs). Three alignment methods are used with multiple protein sequence/profile databases to generate MSAs for input feature generation. Based on different configurations and training strategies of the deep learning method, five MULTICOM distance predictors were created to participate in the CASP14 experiment. Results Benchmarked on 37 hard CASP14 domains, the best performing MULTICOM predictor is ranked 5th out of 30 automated CASP14 distance prediction servers in terms of precision of top L/5 long-range contact predictions (i.e. classifying distances between two residues into two categories: in contact (< 8 Angstrom) and not in contact otherwise) and performs better than the best CASP13 distance prediction method. The best performing MULTICOM predictor is also ranked 6th among automated server predictors in classifying inter-residue distances into 10 distance intervals defined by CASP14 according to the precision of distance classification. The results show that the quality and depth of MSAs depend on alignment methods and sequence databases and have a significant impact on the accuracy of distance prediction. Using larger training datasets and multiple complementary features improves prediction accuracy. However, the number of effective sequences in MSAs is only a weak indicator of the quality of MSAs and the accuracy of predicted distance maps. In contrast, there is a strong correlation between the accuracy of contact/distance predictions and the average probability of the predicted contacts, which can therefore be more effectively used to estimate the confidence of distance predictions and select predicted distance maps. Availability The software package, source code, and data of DeepDist2 are freely available at and 
    more » « less
  3. Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over recent years. Instance-level GNN explanation aims to discover critical input elements, such as nodes or edges, that the target GNN relies upon for making predictions. Though various algorithms are proposed, most of them formalize this task by searching the minimal subgraph, which can preserve original predictions. However, an inductive bias is deep-rooted in this framework: Several subgraphs can result in the same or similar outputs as the original graphs. Consequently, they have the danger of providing spurious explanations and failing to provide consistent explanations. Applying them to explain weakly performed GNNs would further amplify these issues. To address this problem, we theoretically examine the predictions of GNNs from the causality perspective. Two typical reasons for spurious explanations are identified: confounding effect of latent variables like distribution shift and causal factors distinct from the original input. Observing that both confounding effects and diverse causal rationales are encoded in internal representations,we propose a new explanation framework with an auxiliary alignment loss, which is theoretically proven to be optimizing a more faithful explanation objective intrinsically. Concretely for this alignment loss, a set of different perspectives are explored: anchor-based alignment, distributional alignment based on Gaussian mixture models, mutual-information-based alignment, and so on. A comprehensive study is conducted both on the effectiveness of this new framework in terms of explanation faithfulness/consistency and on the advantages of these variants. For our codes, please refer to the following URL link:

    more » « less
  4. The growth of the Web in recent years has resulted in the development of various online platforms that provide healthcare information services. These platforms contain an enormous amount of information, which could be beneficial for a large number of people. However, navigating through such knowledgebases to answer specific queries of healthcare consumers is a challenging task. A majority of such queries might be non-factoid in nature, and hence, traditional keyword-based retrieval models do not work well for such cases. Furthermore, in many scenarios, it might be desirable to get a short answer that sufficiently answers the query, instead of a long document with only a small amount of useful information. In this paper, we propose a neural network model for ranking documents for question answering in the healthcare domain. The proposed model uses a deep attention mechanism at word, sentence, and document levels, for efficient retrieval for both factoid and non-factoid queries, on documents of varied lengths. Specifically, the word-level cross-attention allows the model to identify words that might be most relevant for a query, and the hierarchical attention at sentence and document levels allows it to do effective retrieval on both long and short documents. We also construct a new large-scale healthcare question-answering dataset, which we use to evaluate our model. Experimental evaluation results against several state-of-the-art baselines show that our model outperforms the existing retrieval techniques. 
    more » « less
  5. Abstract Machine learning can be used to automate common or time-consuming engineering tasks for which sufficient data already exist. For instance, design repositories can be used to train deep learning algorithms to assess component manufacturability; however, methods to determine the suitability of a design repository for use with machine learning do not exist. We provide an initial investigation toward identifying such a method using “artificial” design repositories to experimentally test the extent to which altering properties of the dataset impacts the assessment precision and generalizability of neural networks trained on the data. For this experiment, we use a 3D convolutional neural network to estimate quantitative manufacturing metrics directly from voxel-based component geometries. Additive manufacturing (AM) is used as a case study because of the recent growth of AM-focused design repositories such as GrabCAD and Thingiverse that are readily accessible online. In this study, we focus only on material extrusion, the dominant consumer AM process, and investigate three AM build metrics: (1) part mass, (2) support material mass, and (3) build time. Additionally, we compare the convolutional neural network accuracy to that of a baseline multiple linear regression model. Our results suggest that training on design repositories with less standardized orientation and position resulted in more accurate trained neural networks and that orientation-dependent metrics were harder to estimate than orientation-independent metrics. Furthermore, the convolutional neural network was more accurate than the baseline linear regression model for all build metrics. 
    more » « less