NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Generalized Single Index Models and Jensen Effects on Reproduction and Survival

https://doi.org/10.1007/s13253-021-00452-4

Ye, Zi; Hooker, Giles; Ellner, Stephen P. (September 2021, Journal of Agricultural, Biological and Environmental Statistics)
null (Ed.)
Full Text Available
S-LIME: Stabilized-LIME for Model Explanation

https://doi.org/10.1145/3447548.3467274

Zhou, Zhengze; Hooker, Giles; Wang, Fei (August 2021, KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining)
null (Ed.)
An increasing number of machine learning models have been deployed in domains with high stakes such as finance and healthcare. Despite their superior performances, many models are black boxes in nature which are hard to explain. There are growing efforts for researchers to develop methods to interpret these black-box models. Post hoc explanations based on perturbations, such as LIME [39], are widely used approaches to interpret a machine learning model after it has been built. This class of methods has been shown to exhibit large instability, posing serious challenges to the effectiveness of the method itself and harming user trust. In this paper, we propose S-LIME, which utilizes a hypothesis testing framework based on central limit theorem for determining the number of perturbation points needed to guarantee stability of the resulting explanation. Experiments on both simulated and real world data sets are provided to demonstrate the effectiveness of our method.
more » « less
Full Text Available
Bridging Breiman's Brook: From Algorithmic Modeling to Statistical Learning

https://doi.org/10.1353/obs.2021.0027.

Mentch, Lucas; Hooker, Giles (July 2021, Observational studies)
null (Ed.)
In 2001, Leo Breiman wrote of a divide between "data modeling" and "algorithmic modeling" cultures. Twenty years later this division feels far more ephemeral, both in terms of assigning individuals to camps, and in terms of intellectual boundaries. We argue that this is largely due to the "data modelers" incorporating algorithmic methods into their toolbox, particularly driven by recent developments in the statistical understanding of Breiman's own Random Forest methods. While this can be simplistically described as "Breiman won", these same developments also expose the limitations of the prediction-first philosophy that he espoused, making careful statistical analysis all the more important. This paper outlines these exciting recent developments in the random forest literature which, in our view, occurred as a result of a necessary blending of the two ways of thinking Breiman originally described. We also ask what areas statistics and statisticians might currently overlook.
more » « less
Full Text Available
Unbiased Measurement of Feature Importance in Tree-Based Methods

https://doi.org/10.1145/3429445

Zhou, Zhengze; Hooker, Giles (April 2021, ACM Transactions on Knowledge Discovery from Data)
null (Ed.)
We propose a modification that corrects for split-improvement variable importance measures in Random Forests and other tree-based methods. These methods have been shown to be biased towards increasing the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out of sample data, this bias can be corrected yielding better summaries and screening tools.
more » « less
Full Text Available
Unbiased Measurement of Feature Importance in Tree-Based Methods

https://doi.org/3429445

Zhou, Zhengze; Hooker, Giles (January 2021, ACM transactions on knowledge discovery from data)
null (Ed.)
We propose a modification that corrects for split-improvement variable importance measures in Random Forests and other tree-based methods. These methods have been shown to be biased towards increasing the importance of features with more potential splits. We show that by appropriately incorporating split-improvement as measured on out of sample data, this bias can be corrected yielding better summaries and screening tools.
more » « less
Full Text Available
Tree Space Prototypes: Another Look at Making Tree Ensembles Interpretable

https://doi.org/3412815.3416893

Tan, Sarah; Soloviev, Matvey; Hooker, Giles; Wells, Martin (October 2020, Foundations of data science)
null (Ed.)
Ensembles of decision trees perform well on many problems, but are not interpretable. In contrast to existing approaches in interpretability that focus on explaining relationships between features and predictions, we propose an alternative approach to interpret tree ensemble classifiers by surfacing representative points for each class -- prototypes. We introduce a new distance for Gradient Boosted Tree models, and propose new, adaptive prototype selection methods with theoretical guarantees, with the flexibility to choose a different number of prototypes in each class. We demonstrate our methods on random forests and gradient boosted trees, showing that the prototypes can perform as well as or even better than the original tree ensemble when used as a nearest-prototype classifier. In a user study, humans were better at predicting the output of a tree ensemble classifier when using prototypes than when using Shapley values, a popular feature attribution method. Hence, prototypes present a viable alternative to feature-based explanations for tree ensembles.
more » « less
Full Text Available
The Jensen effect and functional single index models: Estimating the ecological implications of nonlinear reaction norms

https://doi.org/10.1214/20-AOAS1349

Ye, Zi; Hooker, Giles; Ellner, Stephen P. (September 2020, The Annals of Applied Statistics)
null (Ed.)
Full Text Available
Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models

Lengerich, Benjamin; Tan, Sarah; Chang, Chun-Hao; Hooker, Giles; Caruana, Rich (August 2020, Proceedings of Machine Learning Research)
null (Ed.)
Models which estimate main effects of individual variables alongside interaction effects have an identifiability challenge: effects can be freely moved between main effects and interaction effects without changing the model prediction. This is a critical problem for interpretability because it permits “contradictory" models to represent the same function. To solve this problem, we propose pure interaction effects: variance in the outcome which cannot be represented by any subset of features. This definition has an equivalence with the Functional ANOVA decomposition. To compute this decomposition, we present a fast, exact algorithm that transforms any piecewise-constant function (such as a tree-based model) into a purified, canonical representation. We apply this algorithm to Generalized Additive Models with interactions trained on several datasets and show large disparity, including contradictions, between the apparent and the purified effects. These results underscore the need to specify data distributions and ensure identifiability before interpreting model parameters.
more » « less
Full Text Available
Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation

https://doi.org/10.1145/3278721.3278725

Tan, Sarah; Caruana, Rich; Hooker, Giles; Lou, Yin (December 2018, Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society)
null (Ed.)
Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, an approach to audit such models without probing the black-box model API or pre-defining features to audit. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by the black-box models. We compare the mimic model trained with distillation to a second, un-distilled transparent model trained on ground truth outcomes, and use differences between the two models to gain insight into the black-box model. We demonstrate the approach on four data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS.
more » « less
Full Text Available

Search for: All records