skip to main content


Title: Combining human predictions with model probabilities via confusion matrices and calibration
An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints.  more » « less
Award ID(s):
1927245
NSF-PAR ID:
10349255
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Advances in Neural Information Processing Systems, 34
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An increasingly common use case for machine learning models is augmenting the abilities of human decision makers. For classification tasks where neither the human nor model are perfectly accurate, a key step in obtaining high performance is combining their individual predictions in a manner that leverages their relative strengths. In this work, we develop a set of algorithms that combine the probabilistic output of a model with the class-level output of a human. We show theoretically that the accuracy of our combination model is driven not only by the individual human and model accuracies, but also by the model's confidence. Empirical results on image classification with CIFAR-10 and a subset of ImageNet demonstrate that such human-model combinations consistently have higher accuracies than the model or human alone, and that the parameters of the combination method can be estimated effectively with as few as ten labeled datapoints. 
    more » « less
  2. Over the last century, direct human modification has been a major driver of coastal wetland degradation, resulting in widespread losses of wetland vegetation and a transition to open water. High-resolution satellite imagery is widely available for monitoring changes in present-day wetlands; however, understanding the rates of wetland vegetation loss over the last century depends on the use of historical panchromatic aerial photographs. In this study, we compared manual image thresholding and an automated machine learning (ML) method in detecting wetland vegetation and open water from historical panchromatic photographs in the Florida Everglades, a subtropical wetland landscape. We compared the same classes delineated in the historical photographs to 2012 multispectral satellite imagery and assessed the accuracy of detecting vegetation loss over a 72 year timescale (1940 to 2012) for a range of minimum mapping units (MMUs). Overall, classification accuracies were >95% across the historical photographs and satellite imagery, regardless of the classification method and MMUs. We detected a 2.3–2.7 ha increase in open water pixels across all change maps (overall accuracies > 95%). Our analysis demonstrated that ML classification methods can be used to delineate wetland vegetation from open water in low-quality, panchromatic aerial photographs and that a combination of images with different resolutions is compatible with change detection. The study also highlights how evaluating a range of MMUs can identify the effect of scale on detection accuracy and change class estimates as well as in determining the most relevant scale of analysis for the process of interest. 
    more » « less
  3. This study aims at sensing and understanding the worker’s activity in a human-centered intelligent manufacturing system. We propose a novel multi-modal approach for worker activity recognition by leveraging information from different sensors and in different modalities. Specifically, a smart armband and a visual camera are applied to capture Inertial Measurement Unit (IMU) signals and videos, respectively. For the IMU signals, we design two novel feature transform mechanisms, in both frequency and spatial domains, to assemble the captured IMU signals as images, which allow using convolutional neural networks to learn the most discriminative features. Along with the above two modalities, we propose two other modalities for the video data, i.e., at the video frame and video clip levels. Each of the four modalities returns a probability distribution on activity prediction. Then, these probability distributions are fused to output the worker activity classification result. A worker activity dataset is established, which at present contains 6 common activities in assembly tasks, i.e., grab a tool/part, hammer a nail, use a power-screwdriver, rest arms, turn a screwdriver, and use a wrench. The developed multi-modal approach is evaluated on this dataset and achieves recognition accuracies as high as 97% and 100% in the leave-one-out and half-half experiments, respectively. 
    more » « less
  4. null (Ed.)
    Accurately mapping tree species composition and diversity is a critical step towards spatially explicit and species-specific ecological understanding. The National Ecological Observatory Network (NEON) is a valuable source of open ecological data across the United States. Freely available NEON data include in-situ measurements of individual trees, including stem locations, species, and crown diameter, along with the NEON Airborne Observation Platform (AOP) airborne remote sensing imagery, including hyperspectral, multispectral, and light detection and ranging (LiDAR) data products. An important aspect of predicting species using remote sensing data is creating high-quality training sets for optimal classification purposes. Ultimately, manually creating training data is an expensive and time-consuming task that relies on human analyst decisions and may require external data sets or information. We combine in-situ and airborne remote sensing NEON data to evaluate the impact of automated training set preparation and a novel data preprocessing workflow on classifying the four dominant subalpine coniferous tree species at the Niwot Ridge Mountain Research Station forested NEON site in Colorado, USA. We trained pixel-based Random Forest (RF) machine learning models using a series of training data sets along with remote sensing raster data as descriptive features. The highest classification accuracies, 69% and 60% based on internal RF error assessment and an independent validation set, respectively, were obtained using circular tree crown polygons created with half the maximum crown diameter per tree. LiDAR-derived data products were the most important features for species classification, followed by vegetation indices. This work contributes to the open development of well-labeled training data sets for forest composition mapping using openly available NEON data without requiring external data collection, manual delineation steps, or site-specific parameters. 
    more » « less
  5. Abstract Background

    Binary classification rules based on a small-sample of high-dimensional data (for instance, gene expression data) are ubiquitous in modern bioinformatics. Constructing such classifiers is challenging due to (a) the complex nature of underlying biological traits, such as gene interactions, and (b) the need for highly interpretable glass-box models. We use the theory of high dimensional model representation (HDMR) to build interpretable low dimensional approximations of the log-likelihood ratio accounting for the effects of each individual gene as well as gene-gene interactions. We propose two algorithms approximating the second order HDMR expansion, and a hypothesis test based on the HDMR formulation to identify significantly dysregulated pairwise interactions. The theory is seen as flexible and requiring only a mild set of assumptions.

    Results

    We apply our approach to gene expression data from both synthetic and real (breast and lung cancer) datasets comparing it also against several popular state-of-the-art methods. The analyses suggest the proposed algorithms can be used to obtain interpretable prediction rules with high prediction accuracies and to successfully extract significantly dysregulated gene-gene interactions from the data. They also compare favorably against their competitors across multiple synthetic data scenarios.

    Conclusion

    The proposed HDMR-based approach appears to produce a reliable classifier that additionally allows one to describe how individual genes or gene-gene interactions affect classification decisions. Both real and synthetic data analyses suggest that our methods can be used to identify gene networks with dysregulated pairwise interactions, and are therefore appropriate for differential networks analysis.

     
    more » « less