Intelligent personal assistant systems, with either text-based or voice-based conversational interfaces, are becoming increasingly popular. Most previous research has used either retrieval-based or generation-based methods. Retrieval-based methods have the advantage of returning fluent and informative responses with great diversity. The retrieved responses are easier to control and explain. However, the response retrieval performance is limited by the size of the response repository. On the other hand, although generation-based methods can return highly coherent responses given conversation context, they are likely to return universal or general responses with insufficient ground knowledge information. In this paper, we build a hybrid neural conversation model with the capability of both response retrieval and generation, in order to combine the merits of these two types of methods. Experimental results on Twitter and Foursquare data show that the proposed model can outperform both retrieval-based methods and generation-based methods (including a recently proposed knowledge-grounded neural conversation model) under both automatic evaluation metrics and human evaluation. Our models and research findings provide new insights on how to integrate text retrieval and text generation models for building conversation systems.
Diverse and faithful knowledge-grounded dialogue generation via sequential posterior inference
The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge selection and response generation separately and may overlook the inherent correlation between these two tasks, or leverage a conditional variational method to jointly optimize knowledge selection and response generation by employing an inference network. In this paper, we present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), capable of selecting knowledge and generating dialogues by approximately sampling from the posterior distribution. Unlike other methods, SPI does not require an inference network or assume a simple geometry of the posterior distribution. This straightforward and intuitive inference procedure of SPI directly queries the response generation model, allowing for accurate knowledge selection and generation of faithful responses. In addition to modeling contributions, our experimental results on two common dialogue datasets (Wizard of Wikipedia and Holl-E) demonstrate that SPI outperforms previous strong baselines according to both automatic and human evaluation metrics.
- Award ID(s): 2015577
- PAR ID: 10469435
- Publisher / Repository: International Conference on Machine Learning (ICML 2023)
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
In recent years, the field of machine learning has made phenomenal progress in the pursuit of simulating real-world data generation processes. One notable example of such success is the variational autoencoder (VAE). In this work, with a small shift in perspective, we leverage and adapt VAEs for a different purpose: uncertainty quantification in scientific inverse problems. We introduce UQ-VAE: a flexible, adaptive, hybrid data/model-informed framework for training neural networks capable of rapid modelling of the posterior distribution representing the unknown parameter of interest. Specifically, from divergence-based variational inference, our framework is derived such that most of the information usually present in scientific inverse problems is fully utilized in the training procedure. Additionally, this framework includes an adjustable hyperparameter that allows selection of the notion of distance between the posterior model and the target distribution. This introduces more flexibility in controlling how optimization directs the learning of the posterior model. Further, this framework possesses an inherent adaptive optimization property that emerges through the learning of the posterior uncertainty.
-
Effective teamwork depends on teammates’ ability to maintain common ground: mutual knowledge about the relevant state of the world and the relevant status of teammates’ actions and plans. This ability integrates diverse skills of reasoning and communication: agents can track common ground by recognizing and registering public updates to ongoing activity, but when this evidence is incomplete, agents may need to describe what they are doing or ask what others are doing. In this paper, we introduce an architecture for integrating these diverse skills to maintain common ground in human–AI teamwork. Our approach offers unique advantages of simplicity, modularity, and extensibility by leveraging generic tools for plan recognition, planning, natural language understanding and generation, and dialogue management. Worked examples illustrate how linguistic and practical reasoning complement each other in the realization of key interactive skills.
-
Joan Bruna, Jan S (Ed.) In recent years, the field of machine learning has made phenomenal progress in the pursuit of simulating real-world data generation processes. One notable example of such success is the variational autoencoder (VAE). In this work, with a small shift in perspective, we leverage and adapt VAEs for a different purpose: uncertainty quantification in scientific inverse problems. We introduce UQ-VAE: a flexible, adaptive, hybrid data/model-constrained framework for training neural networks capable of rapid modelling of the posterior distribution representing the unknown parameter of interest. Specifically, from divergence-based variational inference, our framework is derived such that most of the information usually present in scientific inverse problems is fully utilized in the training procedure. Additionally, this framework includes an adjustable hyperparameter that allows selection of the notion of distance between the posterior model and the target distribution. This introduces more flexibility in controlling how optimization directs the learning of the posterior model. Further, this framework possesses an inherent adaptive optimization property that emerges through the learning of the posterior uncertainty. Numerical results for an elliptic PDE-constrained Bayesian inverse problem are provided to verify the proposed framework.
-
Bayesian data analysis is increasingly used in ecology, but prior specification remains focused on choosing non‐informative priors (e.g., flat or vague priors). One barrier to choosing more informative priors is that priors must be specified on model parameters (e.g., intercepts, slopes, and sigmas), but prior knowledge often exists on the level of the response variable. This is particularly true for common models in ecology, like generalized linear mixed models that have a link function and potentially dozens of parameters, each of which needs a prior distribution. We suggest that this difficulty can be overcome by simulating from the prior predictive distribution and visualizing the results on the scale of the response variable. In doing so, some common choices for non‐informative priors on parameters can easily be seen to produce biologically impossible values of response variables. Such implications of prior choices are difficult to foresee without visualization. We demonstrate a workflow for prior selection using simulation and visualization with two ecological examples (predator–prey body sizes and spider responses to food competition). This approach is not new, but its adoption by ecologists will help to better incorporate prior information in ecological models, thereby maximizing one of the benefits of Bayesian data analysis.
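The prior predictive idea in the abstract above can be illustrated with a minimal sketch. It does not reproduce either of the paper's ecological examples: the intercept-only count model with a log link, the two prior standard deviations, and the helper names `prior_predictive_means` and `quantile` are all assumptions chosen for illustration, and summary quantiles stand in for the visualization step.

```python
import math
import random


def prior_predictive_means(prior_sd, n_draws=5000, seed=0):
    """Draw intercepts from Normal(0, prior_sd) and map them to the response scale.

    Hypothetical model: log(expected count) = intercept.
    """
    rng = random.Random(seed)
    # exp() inverts the log link, giving the expected count implied by each draw.
    return [math.exp(rng.gauss(0.0, prior_sd)) for _ in range(n_draws)]


def quantile(values, q):
    """Empirical quantile by nearest rank (no interpolation)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


# A "vague" prior on the intercept versus a tighter one (both values are assumptions).
vague = prior_predictive_means(prior_sd=10.0)
informed = prior_predictive_means(prior_sd=1.0)

for label, draws in (("sd=10", vague), ("sd=1", informed)):
    print(label, "median:", quantile(draws, 0.5), "97.5%:", quantile(draws, 0.975))
```

Under these assumptions, the upper tail of the vague prior implies expected counts in the hundreds of millions, while the tighter prior stays in single digits. Checking draws on the response scale is meant to expose exactly this kind of implausibility.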