Title: Interpretable Diversity Analysis: Visualizing Feature Representations in Low-Cost Ensembles
Diversity is especially important for low-cost ensemble methods because members often share network structure in order to avoid training several independent models from scratch. Diversity is traditionally analyzed by measuring differences between the outputs of models; however, this gives little insight into how knowledge representations differ between ensemble members. This paper introduces several interpretability methods that can be used to qualitatively analyze diversity. We demonstrate these techniques by comparing the diversity of feature representations between child networks using two low-cost ensemble algorithms, Snapshot Ensembles and Prune and Tune Ensembles. This approach to diversity analysis can lead to valuable insights into how we measure and promote diversity in ensemble methods.
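The abstract does not spell out which representation-comparison tools are used; one standard quantitative complement to such visual analysis is linear Centered Kernel Alignment (CKA), which scores how similarly two networks represent the same inputs. A minimal sketch (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices
    of shape (n_samples, n_features). Values near 1 mean the two
    networks represent the same inputs very similarly; values near 0
    suggest diverse feature representations."""
    X = X - X.mean(axis=0)                     # center each feature
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2  # cross-similarity
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
acts = rng.normal(size=(100, 32))  # toy activations for one layer
print(round(linear_cka(acts, acts), 4))  # identical features -> 1.0
```

Comparing the same layer's activations across two ensemble members with a score like this would make "how different are the learned features?" a number rather than only a picture.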
Award ID(s):
1908866
PAR ID:
10495863
Author(s) / Creator(s):
;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE International Joint Conference on Neural Networks
ISBN:
978-1-6654-8867-9
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Ensemble Learning is an effective method for improving generalization in machine learning. However, as state-of-the-art neural networks grow larger, the computational cost associated with training several independent networks becomes expensive. We introduce a fast, low-cost method for creating diverse ensembles of neural networks without needing to train multiple models from scratch. We do this by first training a single parent network. We then create child networks by cloning the parent and dramatically pruning the parameters of each child to create an ensemble of members with unique and diverse topologies. We then briefly train each child network for a small number of epochs, and these children converge significantly faster than models trained from scratch. We explore various ways to maximize diversity in the child networks, including the use of anti-random pruning and one-cycle tuning. This diversity enables "Prune and Tune" ensembles to achieve results that are competitive with traditional ensembles at a fraction of the training cost. We benchmark our approach against state-of-the-art low-cost ensemble methods and show marked improvement in both accuracy and uncertainty estimation on CIFAR-10 and CIFAR-100.
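The anti-random pruning idea above can be sketched in a few lines for the simplest case, a 50% prune rate, where the second child's sparsity mask is the exact complement of the first's. The helper name and procedure below are illustrative; the paper's exact method may differ:

```python
import numpy as np

def antirandom_children(parent, seed=0):
    """Sketch of anti-random pruning at a 50% prune rate: child B keeps
    exactly the weights that child A drops, so the two children have
    disjoint, maximally different topologies.
    (Hypothetical helper; not the paper's exact procedure.)"""
    rng = np.random.default_rng(seed)
    flat = parent.ravel()
    mask_a = rng.random(flat.size) < 0.5              # child A keeps ~half
    child_a = np.where(mask_a, flat, 0.0).reshape(parent.shape)
    child_b = np.where(~mask_a, flat, 0.0).reshape(parent.shape)  # complement
    return child_a, child_b

parent = np.random.default_rng(1).normal(size=(4, 4))
a, b = antirandom_children(parent)
# No weight survives in both children, yet together they cover the parent.
```

Each child would then be briefly fine-tuned (e.g., with a one-cycle schedule), which this sketch omits.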
  3. Regression ensembles consisting of a collection of base regression models are often used to improve the estimation/prediction performance of a single regression model. It has been shown that the individual accuracy of the base models and the ensemble diversity are the two key factors affecting the performance of an ensemble. In this paper, we derive a theory for regression ensembles that illustrates the subtle trade-off between individual accuracy and ensemble diversity from the perspective of statistical correlations. Then, inspired by our derived theory, we further propose a novel loss function and a training algorithm for deep learning regression ensembles. We then demonstrate the advantage of our training approach over standard regression ensemble methods, including random forest and gradient boosting regressors, on both benchmark regression problems and chemical sensor problems involving analysis of Raman spectroscopy. Our key contribution is that our loss function and training algorithm are able to manage diversity explicitly in an ensemble, rather than merely allowing diversity to occur by happenstance.
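The paper's derived loss is not reproduced in the abstract; a well-known loss of the same general shape, which trades individual member accuracy against an explicit diversity ("ambiguity") term in the spirit of negative-correlation learning, looks like this (illustrative, not the paper's formula):

```python
import numpy as np

def diversity_regularized_loss(preds, y, lam=0.1):
    """Illustrative ensemble loss: average member squared error minus a
    bonus rewarding spread of member predictions around the ensemble
    mean (cf. negative-correlation learning; not the paper's loss)."""
    preds = np.asarray(preds, dtype=float)      # (n_members, n_samples)
    member_mse = np.mean((preds - y) ** 2)      # individual accuracy term
    ambiguity = np.mean((preds - preds.mean(axis=0)) ** 2)  # diversity term
    return member_mse - lam * ambiguity
```

With lam = 0, members are trained independently; raising lam explicitly rewards disagreement among members, which is the kind of direct diversity management the abstract describes.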
  4. Abstract Paleoclimate reconstructions are increasingly central to climate assessments, placing recent and future variability in a broader historical context. Several estimation methods produce plumes of climate trajectories that practitioners often want to compare to other reconstruction ensembles, or to deterministic trajectories produced by other means, such as global climate models. Of particular interest are "offline" data assimilation (DA) methods, which have recently been adapted to paleoclimatology. Offline DA lacks an explicit model connecting time instants, so its ensemble members are not true system trajectories. This obscures quantitative comparisons, particularly when considering the ensemble mean in isolation. We propose several resampling methods to introduce a priori constraints on temporal behavior, as well as a general notion, called plume distance, to carry out quantitative comparisons between collections of climate trajectories ("plumes"). The plume distance provides a norm in the same physical units as the variable of interest (e.g., °C for temperature), and lends itself to assessments of statistical significance. We apply these tools to four paleoclimate comparisons: (1) global mean surface temperature (GMST) in the online and offline versions of the Last Millennium Reanalysis (v2.1); (2) GMST from these two ensembles against simulations of the Paleoclimate Model Intercomparison Project past1000 ensemble; (3) LMRv2.1 against the PAGES 2k (2019) ensemble of GMST; and (4) northern hemisphere mean surface temperature from LMR v2.1 against the Büntgen et al. (2021) ensemble. Results generally show more compatibility between these ensembles than is visually apparent. The proposed methodology is implemented in an open-source Python package, and we discuss possible applications of the plume distance framework beyond paleoclimatology.
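The published plume distance is defined in the paper and its accompanying package; as a toy stand-in, the basic idea of comparing two trajectory ensembles in the variable's own physical units can be sketched as an RMS gap between their ensemble-mean trajectories (the function below is a naive illustration, not the published metric):

```python
import numpy as np

def naive_plume_distance(plume_a, plume_b):
    """Toy stand-in for a plume comparison: RMS gap between the
    ensemble-mean trajectories of two plumes, each of shape
    (n_members, n_times). The result carries the variable's own
    units (e.g., degrees C). The published plume distance is more
    refined than this sketch."""
    mean_a = np.asarray(plume_a, float).mean(axis=0)  # (n_times,)
    mean_b = np.asarray(plume_b, float).mean(axis=0)
    return float(np.sqrt(np.mean((mean_a - mean_b) ** 2)))
```

A metric in physical units makes statements like "these two GMST reconstructions differ by about 0.1 °C on average" directly interpretable, which is the appeal the abstract describes.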
  5. Ensembles of climate model simulations are commonly used to separate externally forced climate change from internal climate variability. However, much of the information gained from running large ensembles is lost in traditional methods of data reduction such as linear trend analysis or large-scale spatial averaging. This paper demonstrates a pattern recognition method (forced pattern filtering) that extracts patterns of externally forced climate change from large ensembles and identifies the forced climate response with up to 10 times fewer ensemble members than simple ensemble averaging. It is particularly effective at filtering out spatially coherent modes of internal variability (e.g., El Niño, North Atlantic Oscillation), which would otherwise alias into estimates of regional responses to forcing. This method is used to identify forced climate responses within the 40-member Community Earth System Model (CESM) large ensemble, including an El Niño-like response to volcanic eruptions and forced trends in the North Atlantic Oscillation. The ensemble-based estimate of the forced response is used to test statistical methods for isolating the forced response from a single realization (i.e., individual ensemble members). Low-frequency pattern filtering is found to effectively identify the forced response within individual ensemble members and is applied to the HadCRUT4 reconstruction of observed temperatures, whereby it identifies slow components of observed temperature changes that are consistent with the expected effects of anthropogenic greenhouse gas and aerosol forcing.
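The low-frequency filtering intuition above can be illustrated in its simplest one-dimensional form: a running mean suppresses fast internal variability so that slow, forced-like changes stand out. This is only a minimal stand-in; the method in the abstract operates on spatial patterns across the ensemble, not on a single time series:

```python
import numpy as np

def running_mean(series, window=10):
    """Minimal stand-in for low-frequency filtering: a running mean
    that damps fast internal variability so slow (forced-like)
    changes dominate. Output is shorter than the input by window-1."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(series, float), kernel, mode="valid")
```

Averaging over a window of length N reduces the variance of uncorrelated noise by roughly a factor of N, which is the same leverage that pattern filtering buys, but obtained spatially rather than temporally.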