
Title: IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
Personal assistant systems, such as Apple Siri, Google Now, Amazon Alexa, and Microsoft Cortana, are becoming ever more widely used. Understanding user intent such as clarification questions, potential answers and user feedback in information-seeking conversations is critical for retrieving good responses. In this paper, we analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model "IART", which refers to "Intent-Aware Ranking with Transformers". IART is built on top of the integration of user intent modeling and language representation learning with the Transformer architecture, which relies entirely on a self-attention mechanism instead of recurrent nets. It incorporates intent-aware utterance attention to derive an importance weighting scheme of utterances in conversation context with the aim of better conversation history understanding. We conduct extensive experiments with three information-seeking conversation data sets including both standard benchmarks and commercial data. Our proposed model outperforms all baseline methods with respect to a variety of metrics. We also perform case studies and analysis of learned user intent and its impact on response ranking in information-seeking conversations to provide interpretation of results. Our research findings provide insights on intent-aware neural ranking models based on Transformers for response selection, and have implications for the design of the next generation of information-seeking conversation systems.
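The abstract describes intent-aware utterance attention only at a high level. The following is a minimal sketch of the general idea, not the paper's actual formulation: each context utterance has a predicted intent distribution, a score is derived from it, and a softmax over the scores gives the importance weights used to combine utterance representations. The function names, the `intent_importance` vector, and the dot-product scoring are all illustrative assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def intent_aware_weights(intent_probs, intent_importance):
    """Return one attention weight per utterance.

    intent_probs: list of per-utterance intent distributions (hypothetical input).
    intent_importance: one importance value per intent class; in a real model
    this would be learned, here it is simply given (assumption).
    """
    scores = [
        sum(p * w for p, w in zip(probs, intent_importance))
        for probs in intent_probs
    ]
    return softmax(scores)

def weighted_context(utterance_vectors, weights):
    """Combine utterance vectors into one context vector using the weights."""
    dim = len(utterance_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, utterance_vectors))
        for d in range(dim)
    ]

# Toy conversation with three utterances and two hypothetical intent classes.
intent_probs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
intent_importance = [0.0, 2.0]  # assumed: the second intent matters more
weights = intent_aware_weights(intent_probs, intent_importance)
context = weighted_context([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights)
```

In this toy setup the second utterance, whose intent distribution leans toward the more important class, receives the largest weight.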
Award ID(s):
Publication Date:
Journal Name:
Proceedings of The Web Conference 2020 (WWW 2020)
Page Range or eLocation-ID:
Sponsoring Org:
National Science Foundation
More Like this
  1. Understanding and characterizing how people interact in information-seeking conversations will be a crucial component in developing effective conversational search systems. In this paper, we introduce a new dataset designed for this purpose and use it to analyze information-seeking conversations by user intent distribution, co-occurrence, and flow patterns. The MSDialog dataset is a labeled conversation dataset of question answering (QA) interactions between information seekers and providers from an online forum on Microsoft products. The dataset contains more than 2,000 multi-turn QA dialogs with 10,000 utterances that are annotated with user intents on the utterance level. Annotations were done using crowdsourcing. With MSDialog, we find some highly recurring patterns in user intent during an information-seeking process. They could be useful for designing conversational search systems. We will make our dataset freely available to encourage exploration of information-seeking conversation models.
  2. Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, in particular, conversational search systems with limited bandwidth interfaces. Analyzing and generating clarifying questions have recently been studied in the literature. However, accurate utilization of user responses to clarifying questions has been relatively less explored. In this paper, we propose a neural network model based on a novel attention mechanism, called multi source attention network. Our model learns a representation for a user-system conversation that includes clarifying questions. In more detail, with the help of multiple information sources, our model weights each term in the conversation. In our experiments, we use two separate external sources, including the top retrieved documents and a set of different possible clarifying questions for the query. We implement the proposed representation learning model for two downstream tasks in conversational search: document retrieval and next clarifying question selection. We evaluate our models using a public dataset for search clarification. Our experiments demonstrate significant improvements compared to competitive baselines.
  3. Conversational AI is a rapidly developing research field in both industry and academia. As one of the major branches of conversational AI, question answering and conversational search have attracted significant attention from researchers in the information retrieval community. It has been a long overdue feature for search engines or conversational assistants to retrieve information iteratively and interactively in a conversational manner. Previous work argues that conversational question answering (ConvQA) is a simplified but concrete setting of conversational search. In this setting, one of the major challenges is to leverage the conversation history to understand and answer the current question. In this work, we propose a novel solution for ConvQA that involves three aspects. First, we propose a positional history answer embedding method to encode conversation history with position information using BERT (Bidirectional Encoder Representations from Transformers) in a natural way. BERT is a powerful technique for text representation. Second, we design a history attention mechanism (HAM) to conduct a "soft selection" for conversation histories. This method attends to history turns with different weights based on how helpful they are in answering the current question. Third, in addition to handling conversation history, we take advantage of multi-task learning (MTL) to do answer prediction along with another essential conversation task (dialog act prediction) using a uniform model architecture. MTL is able to learn more expressive and generic representations to improve the performance of ConvQA. We demonstrate the effectiveness of our model with extensive experimental evaluations on QuAC, a large-scale ConvQA dataset. We show that position information plays an important role in conversation history modeling. We also visualize the history attention and provide new insights into conversation history understanding. The complete implementation of our model will be open-sourced.
  4. Users often need to look through multiple search result pages or reformulate queries when they have complex information-seeking needs. Conversational search systems make it possible to improve user satisfaction by asking questions to clarify users’ search intents. However, answering a series of questions starting with “what/why/how” can take significant effort. To quickly identify user intent and reduce effort during interactions, we propose an intent clarification task based on yes/no questions where the system needs to ask the correct question about intents within the fewest conversation turns. In this task, it is essential to use negative feedback about the previous questions in the conversation history. To this end, we propose a Maximum-Marginal-Relevance (MMR) based BERT model (MMR-BERT) to leverage negative feedback based on the MMR principle for the next clarifying question selection. Experiments on the Qulac dataset show that MMR-BERT outperforms state-of-the-art baselines significantly on the intent identification task and the selected questions also achieve significantly better performance in the associated document retrieval tasks.
  5. Entity set expansion (ESE) refers to mining "siblings" of some user-provided seed entities from unstructured data. It has drawn increasing attention in the IR and NLP communities for its various applications. To the best of our knowledge, there has not been any work towards a supervised neural model for entity set expansion from unstructured data. We suspect that the main reason is the lack of massive annotated entity sets. In order to solve this problem, we propose and implement a toolkit called DBpedia-Sets, which automatically extracts entity sets from any plain text collection and can provide a large amount of distant supervision data for neural model training. We propose a two-channel neural re-ranking model NESE that jointly learns exact and semantic matching of entity contexts. The former accepts entity-context co-occurrence information and the latter learns a non-linear transformer from generally pre-trained embeddings to ESE-task specific embeddings for entities. Experiments on real datasets of different scales from different domains show that NESE outperforms state-of-the-art approaches in terms of precision and MAP, where the improvements are statistically significant and are higher when the given corpus is larger.
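
Item 4 above builds on the Maximum-Marginal-Relevance (MMR) principle. As a rough illustration, here is a minimal sketch of classic greedy MMR selection, which trades off relevance against similarity to items already chosen. It is not the MMR-BERT model from that abstract; the function name `mmr_select`, the parameter `lam`, and the toy scores are assumptions made for this example.

```python
def mmr_select(candidates, relevance, similarity, k, lam=0.5):
    """Greedy MMR: repeatedly pick the candidate maximizing
    lam * relevance - (1 - lam) * (max similarity to already selected items).

    relevance: dict mapping candidate -> relevance score (assumed given).
    similarity: function (a, b) -> similarity score (assumed given).
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def mmr_score(c):
            # Redundancy is 0 when nothing has been selected yet.
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: "q1" and "q2" are near-duplicates, "q3" is less relevant but distinct.
relevance = {"q1": 0.9, "q2": 0.85, "q3": 0.5}
pair_sim = {frozenset(("q1", "q2")): 0.95}

def similarity(a, b):
    # Unlisted pairs get a low default similarity (assumption for the toy data).
    return pair_sim.get(frozenset((a, b)), 0.1)

picked = mmr_select(["q1", "q2", "q3"], relevance, similarity, k=2)
```

With these toy scores the second pick is "q3" rather than the more relevant "q2", because "q2" is nearly identical to the already selected "q1".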