Title: Bayesian Online Learning for Consensus Prediction
Given a pre-trained classifier and multiple human experts, we investigate the task of online classification where model predictions are provided for free but querying humans incurs a cost. In this practical but under-explored setting, oracle ground truth is not available. Instead, the prediction target is defined as the consensus vote of all experts. Given that querying full consensus can be costly, we propose a general framework for online Bayesian consensus estimation, leveraging properties of the multivariate hypergeometric distribution. Based on this framework, we propose a family of methods that dynamically estimate expert consensus from partial feedback by producing a posterior over expert and model beliefs. Analyzing this posterior induces an interpretable trade-off between querying cost and classification performance. We demonstrate the efficacy of our framework against a variety of baselines on CIFAR-10H and ImageNet-16H, two large-scale crowdsourced datasets.
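The posterior computation hinted at in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the uniform prior, the assumption that queried experts form a uniform random subset of the panel, and all function and variable names below are illustrative.

```python
from itertools import product
from math import comb

def mvhg_pmf(sample, population):
    """Multivariate hypergeometric pmf: probability of observing the queried
    vote counts `sample` when drawing sum(sample) experts uniformly without
    replacement from a panel whose full vote counts are `population`."""
    n, total = sum(sample), sum(population)
    if any(s > c for s, c in zip(sample, population)):
        return 0.0
    num = 1
    for s, c in zip(sample, population):
        num *= comb(c, s)
    return num / comb(total, n)

def consensus_posterior(observed, n_experts, prior):
    """Posterior over full vote compositions given partial expert feedback.

    observed  : per-class vote counts from the experts queried so far
    n_experts : total number of experts on the panel
    prior     : maps a full composition (tuple of counts) to a prior weight,
                e.g. derived from the classifier's predicted probabilities
    """
    k = len(observed)
    remaining = n_experts - sum(observed)
    posterior = {}
    # Enumerate how the unqueried experts' votes could be distributed.
    for extra in product(range(remaining + 1), repeat=k):
        if sum(extra) != remaining:
            continue
        full = tuple(o + e for o, e in zip(observed, extra))
        w = mvhg_pmf(observed, full) * prior(full)
        if w > 0:
            posterior[full] = w
    z = sum(posterior.values())
    return {comp: w / z for comp, w in posterior.items()}

# Toy usage: 3 classes, 5 experts, 2 queried so far voted for classes 0 and 1;
# a uniform prior stands in for model-informed beliefs.
post = consensus_posterior([1, 1, 0], n_experts=5, prior=lambda c: 1.0)
map_composition = max(post, key=post.get)
```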
Award ID(s):
1900644
PAR ID:
10553920
Author(s) / Creator(s):
Publisher / Repository:
International Conference on AI and Statistics, Proceedings of Machine Learning Research
Date Published:
Volume:
238
Page Range / eLocation ID:
2539-2547
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Chiappa, Silvia; Calandra, Roberto (Ed.)
    This paper considers a variant of the classical online learning problem with expert predictions. What distinguishes and complicates our model is the lack of any direct feedback on the loss each expert incurs at each time step $t$. We propose an approach that uses peer prediction and identify conditions under which it succeeds. Our techniques revolve around a carefully designed peer score function $s()$ that scores experts’ predictions based on the peer consensus. We show a sufficient condition, which we call \emph{peer calibration}, under which standard online learning algorithms using loss feedback computed by the carefully crafted $s()$ have bounded regret with respect to the unrevealed ground truth values. We then demonstrate how suitable $s()$ functions can be derived for different assumptions and models.
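    As a rough illustration of this setup (not the paper's particular score function), the sketch below plugs a simple peer score into a standard multiplicative-weights update; the squared-distance-to-peer-mean score and all names are placeholders of my own.

```python
import numpy as np

def peer_losses(preds):
    """Placeholder peer score s(): each expert's surrogate loss is the squared
    distance between its prediction and the mean prediction of its peers."""
    preds = np.asarray(preds, dtype=float)
    losses = np.empty(len(preds))
    for i in range(len(preds)):
        peers = np.delete(preds, i)
        losses[i] = (preds[i] - peers.mean()) ** 2
    return losses

def hedge_with_peer_feedback(prediction_stream, eta=0.5):
    """Standard Hedge / multiplicative-weights update, but driven by the
    peer-score losses above instead of unavailable ground-truth losses."""
    weights = None
    for preds in prediction_stream:
        if weights is None:
            weights = np.ones(len(preds)) / len(preds)
        yield float(np.dot(weights, preds))      # aggregated prediction
        weights = weights * np.exp(-eta * peer_losses(preds))
        weights /= weights.sum()

# Three experts predicting a scalar at each of three rounds.
stream = [[0.9, 0.8, 0.1], [0.7, 0.75, 0.2], [0.85, 0.9, 0.3]]
aggregated = list(hedge_with_peer_feedback(stream))
```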
  2. Inverse decision theory (IDT) aims to learn a performance metric for classification by eliciting expert classifications on examples. However, elicitation in practical settings may require many classifications of potentially ambiguous examples. To improve the efficiency of elicitation, we propose the cooperative inverse decision theory (CIDT) framework as a formalization of the performance metric elicitation problem. In cooperative inverse decision theory, the expert and a machine play a game in which both are rewarded according to the expert’s performance metric, but the machine does not initially know what this function is. We show that optimal policies in this framework produce active learning that leads to an exponential improvement in sample complexity over previous work. One of our key findings is that a broad class of sub-optimal experts can be represented as having uncertain preferences. We use this finding to show that such experts naturally fit into our proposed framework, extending inverse decision theory to efficiently deal with decision data that is sub-optimal due to noise, conflicting experts, or systematic error.
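    One way to picture the "uncertain preferences" view, under assumptions entirely of my own choosing (a one-parameter family of candidate metrics represented as decision thresholds, a noisy threshold expert, and a known noise rate), is a small active-elicitation loop:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-parameter family of performance metrics: each candidate
# value is the decision threshold an expert with that metric would use.
candidates = np.linspace(0.1, 0.9, 9)
posterior = np.full(len(candidates), 1.0 / len(candidates))
noise = 0.1                      # assumed rate of expert labeling mistakes
true_threshold = 0.62            # hidden preference of the (sub-optimal) expert

def expert_label(score):
    """Noisy threshold expert: sub-optimality is modeled as label noise."""
    label = int(score > true_threshold)
    return 1 - label if rng.random() < noise else label

for _ in range(25):
    # Active query: pick the example score that splits posterior mass most
    # evenly, i.e. the most informative question to ask the expert.
    grid = np.linspace(0.0, 1.0, 101)
    query = min(grid, key=lambda s: abs(posterior[s > candidates].sum() - 0.5))
    y = expert_label(query)
    # Bayesian update: candidates whose implied decision matches the expert's
    # answer gain mass, with the noise rate folded into the likelihood.
    agree = (query > candidates).astype(int) == y
    posterior *= np.where(agree, 1 - noise, noise)
    posterior /= posterior.sum()

estimated_threshold = candidates[posterior.argmax()]
```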
  3. In classification problems, mislabeled data can have a dramatic effect on the capability of a trained model. The traditional method of dealing with mislabeled data is through expert review. However, this is not always ideal, due to the large volume of data in many classification datasets, such as image datasets supporting deep learning models, and the limited availability of human experts for reviewing the data. Herein, we propose an ordered sample consensus (ORSAC) method to support data cleaning by flagging mislabeled data. This method is inspired by the random sample consensus (RANSAC) method for outlier detection. In short, the method involves iteratively training and testing a model on different splits of the dataset, recording misclassifications, and flagging data that is frequently misclassified as probably mislabeled. We evaluate the method by purposefully mislabeling subsets of data and assessing the method’s capability to find such data. We demonstrate with three datasets, a mosquito image dataset, CIFAR-10, and CIFAR-100, that this method is reliable in finding mislabeled data with a high degree of accuracy. Our experimental results indicate a high proficiency of our methodology in identifying mislabeled data across these diverse datasets, with performance assessed using different mislabeling frequencies. 
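    Under assumptions of my own (a logistic-regression base model, random 70/30 splits, and a 50% misclassification-rate flag threshold), the train/test/count loop described above might look like the following sketch:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def flag_mislabeled(X, y, n_rounds=20, flag_rate=0.5, seed=0):
    """Repeated split/train/test loop in the spirit of ORSAC: count how often
    each example is misclassified when held out, and flag the examples
    misclassified in at least `flag_rate` of their evaluations."""
    miss = np.zeros(len(y))
    seen = np.zeros(len(y))
    idx = np.arange(len(y))
    for r in range(n_rounds):
        tr, te = train_test_split(idx, test_size=0.3, random_state=seed + r,
                                  stratify=y)
        clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
        miss[te] += clf.predict(X[te]) != y[te]
        seen[te] += 1
    rate = miss / np.maximum(seen, 1)
    return np.where(rate >= flag_rate)[0]   # indices suspected of being mislabeled

# Toy check: flip 5% of labels in a synthetic binary dataset and see whether
# the flipped examples surface among the flagged indices.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
flipped = np.random.default_rng(0).choice(len(y), size=25, replace=False)
y[flipped] = 1 - y[flipped]
flagged = flag_mislabeled(X, y)
```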
  4. We often need to have beliefs about things on which we are not experts. Luckily, we often have access to expert judgments on such topics. But how should we form our beliefs on the basis of expert opinion when experts conflict in their judgments? This is the core of the novice/2-experts problem in social epistemology. A closely related question is important in the context of policy making: how should a policy maker use expert judgments when making policy in domains in which she is not herself an expert? This question is more complex, given the messy and strategic nature of politics. In this paper we argue that the prediction with expert advice (PWEA) framework from machine learning provides helpful tools for addressing these problems. We outline conditions under which we should expect PWEA to be helpful and those under which we should not expect these methods to perform well.
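    For concreteness, a textbook exponential-weights instance of PWEA is sketched below; the squared (Brier-style) loss, the learning rate, and the toy forecast data are assumptions, not anything specified by this paper.

```python
import numpy as np

def pwea_exponential_weights(expert_forecasts, outcomes, eta=0.5):
    """Exponential-weights PWEA: the novice aggregates the experts'
    probabilistic forecasts with a weighted average, then reweights each
    expert by how well their forecast matched the realized outcome."""
    F = np.asarray(expert_forecasts, dtype=float)   # shape (T, n_experts)
    y = np.asarray(outcomes, dtype=float)           # shape (T,)
    w = np.full(F.shape[1], 1.0 / F.shape[1])
    novice_beliefs = []
    for forecasts, outcome in zip(F, y):
        novice_beliefs.append(float(w @ forecasts))  # novice's credence this round
        losses = (forecasts - outcome) ** 2          # squared (Brier-style) loss
        w = w * np.exp(-eta * losses)
        w /= w.sum()
    return np.array(novice_beliefs), w

# Two conflicting experts forecasting binary events; the second tracks the
# outcomes better, so its weight grows over the rounds.
forecasts = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.1], [0.9, 0.2]]
outcomes = [0, 0, 0, 0]
beliefs, final_weights = pwea_exponential_weights(forecasts, outcomes)
```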
  5. We study online convex optimization with switching costs, a practically important but extremely challenging problem due to the lack of complete offline information. By tapping into the power of machine learning (ML) based optimizers, ML-augmented online algorithms (also referred to as expert calibration in this paper) have been emerging as the state of the art, with provable worst-case performance guarantees. Nonetheless, with the standard practice of training an ML model as a standalone optimizer and plugging it into an ML-augmented algorithm, the average cost performance can be highly unsatisfactory. To address this how-to-learn challenge, we propose EC-L2O (expert-calibrated learning to optimize), which trains an ML-based optimizer by explicitly taking into account the downstream expert calibrator. To accomplish this, we propose a new differentiable expert calibrator that generalizes regularized online balanced descent and offers a provably better competitive ratio than pure ML predictions when the prediction error is large. For training, our loss function is a weighted sum of two losses: one minimizing the average ML prediction error for better robustness, and the other minimizing the post-calibration average cost. We also provide theoretical analysis for EC-L2O, highlighting that expert calibration can even be beneficial for the average cost performance and that the high-percentile tail ratio of the cost achieved by EC-L2O to that of the offline optimal oracle (i.e., the tail cost ratio) can be bounded. Finally, we test EC-L2O by running simulations for sustainable datacenter demand response. Our results demonstrate that EC-L2O can empirically achieve both a lower average cost and a lower competitive ratio than existing baseline algorithms.
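    A minimal sketch of the training idea follows, under heavy simplifications of my own: a linear predictor stands in for the ML optimizer, a one-step quadratic proximal blend stands in for the regularized online balanced descent calibrator, and all constants and the synthetic data are illustrative.

```python
import torch

torch.manual_seed(0)
T, d = 50, 4
feats = torch.randn(T, d)
targets = feats @ torch.tensor([1.0, -0.5, 0.3, 0.2]) + 0.1 * torch.randn(T)

predictor = torch.nn.Linear(d, 1)        # stand-in for the ML-based optimizer
opt = torch.optim.Adam(predictor.parameters(), lr=0.05)
lam, switch_w, mix = 0.5, 0.3, 0.5       # illustrative constants

for epoch in range(200):
    preds = predictor(feats).squeeze(-1)
    # Differentiable "calibrator": each action minimizes a quadratic hitting
    # cost at the raw prediction plus a quadratic pull toward the previous
    # action -- a crude stand-in for regularized online balanced descent.
    actions, prev = [], torch.zeros(())
    for p in preds:
        prev = (p + lam * prev) / (1 + lam)
        actions.append(prev)
    actions = torch.stack(actions)
    hit_cost = ((actions - targets) ** 2).mean()
    switch_cost = ((actions[1:] - actions[:-1]) ** 2).mean()
    pred_err = ((preds - targets) ** 2).mean()
    # Weighted sum of the two losses described in the abstract: raw prediction
    # error for robustness plus the post-calibration average cost.
    loss = mix * pred_err + (1 - mix) * (hit_cost + switch_w * switch_cost)
    opt.zero_grad()
    loss.backward()
    opt.step()
```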