skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Combining human predictions with model probabilities via confusion matrices and calibration
An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.  more » « less
Award ID(s):
1900644
PAR ID:
10355512
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Advances in Neural Information Processing Systems (NeurIPS 2021)
Volume:
34
Page Range / eLocation ID:
4421-4434
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints. 
    more » « less
  2. Algorithmic case-based decision support provides examples to help human make sense of predicted labels and aid human in decision-making tasks. Despite the promising performance of supervised learning, representations learned by supervised models may not align well with human intuitions: what models consider as similar examples can be perceived as distinct by humans. As a result, they have limited effectiveness in case-based decision support. In this work, we incorporate ideas from metric learning with supervised learning to examine the importance of alignment for effective decision support. In addition to instance-level labels, we use human-provided triplet judgments to learn human-compatible decision-focused representations. Using both synthetic data and human subject experiments in multiple classification tasks, we demonstrate that such representation is better aligned with human perception than representation solely optimized for classification. Human-compatible representations identify nearest neighbors that are perceived as more similar by humans and allow humans to make more accurate predictions, leading to substantial improvements in human decision accuracies (17.8% in butterfly vs. moth classification and 13.2% in pneumonia classification). 
    more » « less
  3. Over the last century, direct human modification has been a major driver of coastal wetland degradation, resulting in widespread losses of wetland vegetation and a transition to open water. High-resolution satellite imagery is widely available for monitoring changes in present-day wetlands; however, understanding the rates of wetland vegetation loss over the last century depends on the use of historical panchromatic aerial photographs. In this study, we compared manual image thresholding and an automated machine learning (ML) method in detecting wetland vegetation and open water from historical panchromatic photographs in the Florida Everglades, a subtropical wetland landscape. We compared the same classes delineated in the historical photographs to 2012 multispectral satellite imagery and assessed the accuracy of detecting vegetation loss over a 72 year timescale (1940 to 2012) for a range of minimum mapping units (MMUs). Overall, classification accuracies were >95% across the historical photographs and satellite imagery, regardless of the classification method and MMUs. We detected a 2.3–2.7 ha increase in open water pixels across all change maps (overall accuracies > 95%). Our analysis demonstrated that ML classification methods can be used to delineate wetland vegetation from open water in low-quality, panchromatic aerial photographs and that a combination of images with different resolutions is compatible with change detection. The study also highlights how evaluating a range of MMUs can identify the effect of scale on detection accuracy and change class estimates as well as in determining the most relevant scale of analysis for the process of interest. 
    more » « less
  4. A brain tumor is an abnormal growth in the brain that disrupts its functionality and poses a significant threat to human life by damaging neurons. Early detection and classification of brain tumors are crucial to prevent complications and maintain good health. Recent advancements in deep learning techniques have shown immense potential in image classification and segmentation for tumor identification and classification. In this study, we present a platform, BrainView, for detection, and segmentation of brain tumors from Magnetic Resonance Images (MRI) using deep learning. We utilized EfficientNetB7 pre-trained model to design our proposed DeepBrainNet classification model for analyzing brain MRI images to classify its type. We also proposed a EfficinetNetB7 based image segmentation model, called the EffB7-UNet, for tumor localization. Experimental results show significantly high classification (99.96%) and segmentation (92.734%) accuracies for our proposed models. Finally, we discuss the contours of a cloud application for BrainView using Flask and Flutter to help researchers and clinicians use our machine learning models online for research purposes. 
    more » « less
  5. IEEE Open Journal of the Computer Society (Ed.)
    While neural networks have been achieving increasingly significant excitement in solving classification tasks such as natural language processing, their lack of interpretability becomes a great challenge for neural networks to be deployed in certain high-stakes human-centered applications. To address this issue, we propose a new approach for generating interpretable predictions by inferring a simple three-layer neural network with threshold activations, so that it can benefit from effective neural network training algorithms and at the same time, produce human-understandable explanations for the results. In particular, the hidden layer neurons in the proposed model are trained with floating point weights and binary output activations. The output neuron is also trainable as a threshold logic function that implements a disjunctive operation, forming the logical-OR of the first-level threshold logic functions. This neural network can be trained using state-of-the-art training methods to achieve high prediction accuracy. An important feature of the proposed architecture is that only a simple greedy algorithm is required to provide an explanation with the prediction that is human-understandable. In comparison with other explainable decision models, our proposed approach achieves more accurate predictions on a broad set of tabular data classification datasets. 
    more » « less