

Title: Machine truth serum: a surprisingly popular approach to improving ensemble methods
The wisdom of the crowd (Surowiecki, 2005a) revealed a striking fact: the majority-vote answer of a crowd is usually more accurate than that of a few individual experts. The same story holds in machine learning. Ensemble methods (Dietterich, 2000) leverage this idea and exploit multiple machine learning algorithms in various settings, e.g., supervised and semi-supervised learning, achieving better performance by aggregating the predictions of the constituent algorithms than any of them obtains alone. Nonetheless, existing aggregation rules fail when the majority answer of the constituent algorithms is more likely to be wrong. In this paper, we extend the idea proposed in Bayesian Truth Serum (Prelec, 2004), that "a surprisingly more popular answer is more likely to be the true answer than the majority one," to supervised classification, improved through an ensemble of final predictions, and to semi-supervised classification (e.g., MixMatch (Berthelot et al., 2019)), enhanced through an ensemble of data augmentations. The challenge is to define, and to detect, when an answer should be considered "surprising." We present two machine-learning-aided methods that can reveal the truth when the minority rather than the majority holds the true answer, in both the supervised and the semi-supervised classification settings. We name our proposed method the Machine Truth Serum. Our experiments on a set of classification tasks (image, text, etc.) show that classification performance can be further improved by applying Machine Truth Serum in the ensemble final-predictions step (supervised) and in the ensemble data-augmentations step (semi-supervised).
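The surprisingly-popular rule at the heart of the abstract can be sketched in a few lines. This is a minimal illustration of Prelec-style aggregation, not the paper's Machine Truth Serum itself: the function name and toy data are hypothetical. Each voter supplies an answer plus a prediction of how popular each answer will be; the winner is the answer whose actual vote share most exceeds its predicted share.

```python
from collections import Counter

def surprisingly_popular(votes, predicted_shares):
    """Pick the answer whose actual vote share most exceeds the
    share that voters predicted it would receive.
    votes: list of answers, one per voter.
    predicted_shares: dict answer -> average predicted share (0..1).
    """
    n = len(votes)
    actual = Counter(votes)
    best, best_surprise = None, float("-inf")
    for ans, count in actual.items():
        # "Surprise" = actual popularity minus predicted popularity.
        surprise = count / n - predicted_shares.get(ans, 0.0)
        if surprise > best_surprise:
            best, best_surprise = ans, surprise
    return best

# Majority says "no" (3 of 5), but "yes" is surprisingly popular:
# it was predicted to get 20% of votes yet actually got 40%.
votes = ["no", "no", "no", "yes", "yes"]
predicted = {"no": 0.80, "yes": 0.20}
print(surprisingly_popular(votes, predicted))  # -> yes
```

In the paper's machine-learning setting the "voters" would be constituent classifiers rather than people, with the predicted popularity learned rather than elicited.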
Award ID(s):
2007951
PAR ID:
10391575
Author(s) / Creator(s):
Date Published:
Journal Name:
Machine Learning
ISSN:
0885-6125
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this work, we use a generative adversarial network (GAN) to train crowd counting networks using minimal data. We describe how GAN objectives can be modified to allow for the use of unlabeled data to benefit inference training in semi-supervised learning. More generally, we explain how these same methods can be used in more generic multiple regression target semi-supervised learning, with crowd counting being a demonstrative example. Given a convolutional neural network (CNN) with capabilities equivalent to the discriminator in the GAN, we provide experimental results which show that our GAN is able to outperform the CNN even when the CNN has access to significantly more labeled data. This presents the potential of training such networks to high accuracy with little data. Our primary goal is not to outperform the state-of-the-art using an improved method on the entire dataset, but instead we work to show that through semi-supervised learning we can reduce the data required to train an inference network to a given accuracy. To this end, systematic experiments are performed with various numbers of images and cameras to show under which situations the semi-supervised GANs can improve results. 
  2. Credit cards have become an essential part of daily life because they are easy to use and flexible for payment. A critical consequence of their increased use is the occurrence of fraudulent transactions, which allow illegal users to obtain money and goods through unauthorized usage. Artificial Intelligence (AI) and Machine Learning (ML) have become effective techniques used in many applications to ensure cybersecurity. This paper proposes a fraud detection system called Man-Ensemble CCFD, an ensemble-learning model with two stages of classification and detection. Stage one, called ML-CCFD, utilizes ten machine learning algorithms to classify credit card transactions as class 1 (fraudulent) or class 0 (legitimate). We compared their classification reports, specifically precision, recall (sensitivity), and F1-score, and then selected the most accurate ML algorithms based on their classification performance and prediction accuracy. The second stage, known as Ensemble-learning CCFD, is an ensemble model that applies the Man-Ensemble method to the most effective ML algorithms from stage one to obtain the final prediction, instead of using common ensemble-learning approaches such as voting, stacking, or boosting. Our framework's results showed the effectiveness and efficiency of our fraud detection system compared to using the ML algorithms individually, which suffer from weaknesses such as errors, overfitting, bias, limited prediction accuracy, and reduced robustness.
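The two-stage select-then-aggregate structure described in the abstract above can be sketched generically. Hard majority voting here is only a placeholder for the paper's own Man-Ensemble aggregation rule, which the abstract does not specify; all function names and data are hypothetical.

```python
from collections import Counter

def accuracy(preds, labels):
    """Fraction of predictions matching the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def two_stage_ensemble(val_preds, val_labels, test_preds, k=3):
    """Stage 1: rank base classifiers by validation accuracy, keep top k.
       Stage 2: aggregate the selected classifiers' test predictions
       (hard majority vote as a stand-in for the final aggregation rule).
       val_preds / test_preds: {classifier name: list of predictions}."""
    scores = {name: accuracy(p, val_labels) for name, p in val_preds.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    n_test = len(next(iter(test_preds.values())))
    return [Counter(test_preds[name][i] for name in top).most_common(1)[0][0]
            for i in range(n_test)]
```

A real pipeline would also cross-validate the selection step so that stage one does not overfit the validation split.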
  3. ABSTRACT Machine learning models can greatly improve the search for strong gravitational lenses in imaging surveys by reducing the amount of human inspection required. In this work, we test the performance of supervised, semi-supervised, and unsupervised learning algorithms trained with the ResNetV2 neural network architecture on their ability to efficiently find strong gravitational lenses in the Deep Lens Survey (DLS). We use galaxy images from the survey, combined with simulated lensed sources, as labeled data in our training data sets. We find that models using semi-supervised learning along with data augmentations (transformations applied to an image during training, e.g. rotation) and Generative Adversarial Network (GAN) generated images yield the best performance. They offer 5–10 times better precision across all recall values compared to supervised algorithms. Applying the best performing models to the full 20 deg² DLS survey, we find 3 Grade-A lens candidates within the top 17 image predictions from the model. This increases to 9 Grade-A and 13 Grade-B candidates when 1 per cent (∼2500 images) of the model predictions are visually inspected. This is ≳10× the sky density of lens candidates compared to current shallower wide-area surveys (such as the Dark Energy Survey), indicating a trove of lenses awaiting discovery in upcoming deeper all-sky surveys. These results suggest that pipelines tasked with finding strong lens systems can be highly efficient, minimizing human effort. We additionally report spectroscopic confirmation of the lensing nature of two Grade-A candidates identified by our model, further validating our methods.
  4. Aggregating signals from a collection of noisy sources is a fundamental problem in many domains including crowd-sourcing, multi-agent planning, sensor networks, signal processing, voting, ensemble learning, and federated learning. The core question is how to aggregate signals from multiple sources (e.g. experts) in order to reveal an underlying ground truth. While a full answer depends on the type of signal, correlation of signals, and desired output, a problem common to all of these applications is that of differentiating sources based on their quality and weighting them accordingly. It is often assumed that this differentiation and aggregation is done by a single, accurate central mechanism or agent (e.g. judge). We complicate this model in two ways. First, we investigate the setting with both a single judge, and one with multiple judges. Second, given this multi-agent interaction of judges, we investigate various constraints on the judges’ reporting space. We build on known results for the optimal weighting of experts and prove that an ensemble of sub-optimal mechanisms can perform optimally under certain conditions. We then show empirically that the ensemble approximates the performance of the optimal mechanism under a broader range of conditions. 
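The "optimal weighting of experts" that the abstract above builds on can be illustrated with the classic log-odds weighting rule for independent binary experts (a standard textbook result, not this paper's mechanism); the names and numbers here are illustrative only.

```python
import math

def weighted_majority(votes, accuracies):
    """Optimal weighting for independent binary experts: weight each
    expert's +/-1 vote by log(p / (1 - p)), where p is that expert's
    accuracy, and return the sign of the weighted sum."""
    score = sum(math.log(p / (1 - p)) * v for v, p in zip(votes, accuracies))
    return 1 if score > 0 else -1

# Two weak experts (60% accurate) vote +1; one strong expert (90%)
# votes -1. The strong expert's log-odds weight outweighs the pair:
votes = [1, 1, -1]
accs = [0.6, 0.6, 0.9]
print(weighted_majority(votes, accs))  # -> -1
```

Note that an unweighted majority vote over the same inputs would have returned +1; differentiating sources by quality changes the outcome.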
  5. The past decades have witnessed the prosperity of graph mining, with a multitude of sophisticated models and algorithms designed for various mining tasks, such as ranking, classification, clustering and anomaly detection. Generally speaking, the vast majority of the existing works aim to answer the following question, that is, given a graph, what is the best way to mine it? In this paper, we introduce the graph sanitation problem, to answer an orthogonal question. That is, given a mining task and an initial graph, what is the best way to improve the initially provided graph? By learning a better graph as part of the input of the mining model, it is expected to benefit graph mining in a variety of settings, ranging from denoising, imputation to defense. We formulate the graph sanitation problem as a bilevel optimization problem, and further instantiate it by semi-supervised node classification, together with an effective solver named GaSoliNe. Extensive experimental results demonstrate that the proposed method is (1) broadly applicable with respect to various graph neural network models and flexible graph modification strategies, (2) effective in improving the node classification accuracy on both the original and contaminated graphs in various perturbation scenarios. In particular, it brings up to 25% performance improvement over the existing robust graph neural network methods.