skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Variational Inference from Ranked Samples with Features
In many supervised learning settings, elicited labels comprise pairwise comparisons or rankings of samples. We propose a Bayesian inference model for ranking datasets, allowing us to take a probabilistic approach to ranking inference. Our probabilistic assumptions are motivated by, and consistent with, the so-called Plackett-Luce model. We propose a variational inference method to extract a closed-form Gaussian posterior distribution. We show experimentally that the resulting posterior yields more reliable ranking predictions compared to predictions via point estimates.  more » « less
Award ID(s):
1750539
PAR ID:
10131982
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of Machine Learning Research
Volume:
101
Page Range / eLocation ID:
599-614
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In practice, it is essential to compare and rank candidate policies offline before real-world deployment for safety and reliability. Prior work seeks to solve this offline policy ranking (OPR) problem through value-based methods, such as Off-policy evaluation (OPE). However, they fail to analyze special case performance (e.g., worst or best cases), due to the lack of holistic characterization of policies’ performance. It is even more difficult to estimate precise policy values when the reward is not fully accessible under sparse settings. In this paper, we present Probabilistic Offline Policy Ranking (POPR), a framework to address OPR problems by leveraging expert data to characterize the probability of a candidate policy behaving like experts, and approximating its entire performance posterior distribution to help with ranking. POPR does not rely on value estimation, and the derived performance posterior can be used to distinguish candidates in worst-, best-, and average-cases. To estimate the posterior, we propose POPR-EABC, an Energy-based Approximate Bayesian Computation (ABC) method conducting likelihood-free inference. POPR-EABC reduces the heuristic nature of ABC by a smooth energy function, and improves the sampling efficiency by a pseudo-likelihood. We empirically demonstrate that POPR-EABC is adequate for evaluating policies in both discrete and continuous action spaces across various experiment environments, and facilitates probabilistic comparisons of candidate policies before deployment. 
    more » « less
  2. We propose an online variational inference framework for joint parameter-state estimation in nonlinear systems. This approach provides a probabilistic estimate of both parameters and states, and does so without relying on a mean-field assumption of independence of the two. The proposed method leverages a factorized form of the target posterior distribution to enable an effective pairing of variational inference for the marginal posterior of parameters with conditional Gaussian filtering for the conditional posterior of the states. This factorization is retrained at every time-step via formulation that combines variational inference and regression. The effectiveness of the framework is demonstrated through applications to two example systems, where it outperforms both the joint Unscented Kalman Filter and Bootstrap Particle Filter parameter-state augmentation in numerical experiments. 
    more » « less
  3. Abstract Inference is crucial in modern astronomical research, where hidden astrophysical features and patterns are often estimated from indirect and noisy measurements. Inferring the posterior of hidden features, conditioned on the observed measurements, is essential for understanding the uncertainty of results and downstream scientific interpretations. Traditional approaches for posterior estimation include sampling-based methods and variational inference (VI). However, sampling-based methods are typically slow for high-dimensional inverse problems, while VI often lacks estimation accuracy. In this paper, we proposeα-deep probabilistic inference, a deep learning framework that first learns an approximate posterior usingα-divergence VI paired with a generative neural network, and then produces more accurate posterior samples through importance reweighting of the network samples. It inherits strengths from both sampling and VI methods: it is fast, accurate, and more scalable to high-dimensional problems than conventional sampling-based approaches. We apply our approach to two high-impact astronomical inference problems using real data: exoplanet astrometry and black hole feature extraction. 
    more » « less
  4. The remarkable success of the Transformer model in Natural Language Processing (NLP) is increasingly capturing the attention of vision researchers in contemporary times. The Vision Transformer (ViT) model effectively models long-range dependencies while utilizing a self-attention mechanism by converting image information into meaningful representations. Moreover, the parallelism property of ViT ensures better scalability and model generalization compared to Recurrent Neural Networks (RNN). However, developing robust ViT models for high-risk vision applications, such as self-driving cars, is critical. Deterministic ViT models are susceptible to noise and adversarial attacks and incapable of yielding a level of confidence in output predictions. Quantifying the confidence (or uncertainty) level in the decision is highly important in such real-world applications. In this work, we introduce a probabilistic framework for ViT to quantify the level of uncertainty in the model's decision. We approximate the posterior distribution of network parameters using variational inference. While progressing through non-linear layers, the first-order Taylor approximation was deployed. The developed framework propagates the mean and covariance of the posterior distribution through layers of the probabilistic ViT model and quantifies uncertainty at the output predictions. Quantifying uncertainty aids in providing warning signals to real-world applications in case of noisy situations. Experimental results from extensive simulation conducted on numerous benchmark datasets (e.g., MNIST and Fashion-MNIST) for image classification tasks exhibit 1) higher accuracy of proposed probabilistic ViT under noise or adversarial attacks compared to the deterministic ViT. 2) Self-evaluation through uncertainty becomes notably pronounced as noise levels escalate. Simulations were conducted at the Texas Advanced Computing Center (TACC) on the Lonestar6 supercomputer node. With the help of this vital resource, we completed all the experiments within a reasonable period. 
    more » « less
  5. We propose that approximate Bayesian algorithms should optimize a new criterion, directly derived from the loss, to calculate their approximate posterior which we refer to as pseudo-posterior. Unlike standard variational inference which optimizes a lower bound on the log marginal likelihood, the new algorithms can be analyzed to provide loss guarantees on the predictions with the pseudo-posterior. Our criterion can be used to derive new sparse Gaussian process algorithms that have error guarantees applicable to various likelihoods. 
    more » « less