NSF PAR Search | NSF Public Access Repository

Bayesian Regression Tree Ensembles that Adapt to Smoothness and Sparsity

https://doi.org/10.1111/rssb.12293

Linero, Antonio R.; Yang, Yun (September 2018, Journal of the Royal Statistical Society Series B: Statistical Methodology)

Summary: Ensembles of decision trees are a useful tool for obtaining flexible estimates of regression functions. Examples of these methods include gradient-boosted decision trees, random forests and Bayesian classification and regression trees. Two potential shortcomings of tree ensembles are their lack of smoothness and their vulnerability to the curse of dimensionality. We show that these issues can be overcome by instead considering sparsity-inducing soft decision trees in which the decisions are treated as probabilistic. We implement this in the context of the Bayesian additive regression trees framework and illustrate its promising performance through testing on benchmark data sets. We provide strong theoretical support for our methodology by showing that the posterior distribution concentrates at the minimax rate (up to a logarithmic factor) for sparse functions and functions with additive structures in the high-dimensional regime where the dimensionality of the covariate space is allowed to grow nearly exponentially in the sample size. Our method also adapts to the unknown smoothness and sparsity levels, and can be implemented by making minimal modifications to existing Bayesian additive regression tree algorithms.
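To make the idea of a probabilistic ("soft") decision concrete, the following is a minimal sketch of a single soft split, not the authors' implementation. It assumes a logistic gating function and a depth-one tree; the names `gate`, `soft_stump`, `hard_stump`, `cutpoint` and `bandwidth` are hypothetical and chosen only for illustration. The abstract states only that decisions are treated as probabilistic, so the specific gate used here is an assumption.

```python
import math


def gate(x, cutpoint, bandwidth):
    """Assumed logistic gate: probability of taking the right branch."""
    z = (x - cutpoint) / bandwidth
    # Numerically stable sigmoid so very small bandwidths do not overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_stump(x, cutpoint, bandwidth, left_value, right_value):
    """Depth-one soft tree: leaf values averaged by branch probabilities."""
    p_right = gate(x, cutpoint, bandwidth)
    return (1.0 - p_right) * left_value + p_right * right_value


def hard_stump(x, cutpoint, left_value, right_value):
    """Ordinary (hard) split, shown for comparison."""
    return right_value if x > cutpoint else left_value


if __name__ == "__main__":
    # The soft prediction varies smoothly in x; the hard one jumps at the cutpoint.
    for x in (0.3, 0.45, 0.5, 0.55, 0.7):
        soft = soft_stump(x, 0.5, 0.05, -1.0, 1.0)
        hard = hard_stump(x, 0.5, -1.0, 1.0)
        print(f"x={x:.2f}  soft={soft:+.3f}  hard={hard:+.1f}")
```

In this sketch, shrinking `bandwidth` toward zero recovers the hard split, while larger values give a smoother prediction; this is one way to see how soft decisions can address the lack of smoothness mentioned in the summary.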
Incorporating Grouping Information into Bayesian Decision Tree Ensembles

Du, Junliang; Linero, Antonio Ricardo (July 2019, Proceedings of the 36th International Conference on Machine Learning)

We consider the problem of nonparametric regression in the high-dimensional setting in which P≫N. We study the use of overlapping group structures to improve prediction and variable selection. These structures arise commonly when analyzing DNA microarray data, where genes can naturally be grouped according to genetic pathways. We incorporate overlapping group structure into a Bayesian additive regression trees model using a prior constructed so that, if a variable from some group is used to construct a split, this increases the probability that subsequent splits will use predictors from the same group. We refer to our model as an overlapping group Bayesian additive regression trees (OG-BART) model, and to our prior on the splits as an overlapping group Dirichlet (OG-Dirichlet) prior. Like the sparse group lasso, our prior encourages sparsity both within and between groups. We study the correlation structure of the prior, illustrate the proposed methodology on simulated data, and apply the methodology to gene expression data to learn which genetic pathways are predictive of breast cancer tumor metastasis.
Full Text Available
A Bayesian approach to sequential monitoring of nonlinear profiles using wavelets: Wavelet-Based Bayesian Profile Monitoring

https://doi.org/10.1002/qre.2409

Varbanov, Roumen; Chicken, Eric; Linero, Antonio; Yang, Yun (April 2019, Quality and Reliability Engineering International)

Full Text Available
Interaction Detection with Bayesian Decision Tree Ensembles

Du, Junliang; Linero, Antonio Ricardo (April 2019, Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS))

Methods based on Bayesian decision tree ensembles have proven valuable in constructing high-quality predictions, and are particularly attractive in certain settings because they encourage low-order interaction effects. Although these ensembles adapt to the presence of low-order interactions for prediction purposes, we show that they are generally anti-conservative for the purpose of interaction detection. We address this problem by introducing Dirichlet process forests (DP-Forests), which leverage the presence of low-order interactions by clustering the trees so that trees within the same cluster focus on detecting a specific interaction. We show on both simulated and benchmark data that DP-Forests perform well relative to existing interaction detection techniques for detecting low-order interactions, attaining very low false-positive and false-negative rates while maintaining the same performance for prediction using a comparable computational budget.
Full Text Available
Multi-rubric models for ordinal spatial data with application to online ratings data

https://doi.org/10.1214/18-AOAS1143

Linero, Antonio R.; Bradley, Jonathan R.; Desai, Apurva (December 2018, The Annals of Applied Statistics)

Full Text Available
Bayesian Approaches for Missing Not at Random Outcome Data: The Role of Identifying Restrictions

https://doi.org/10.1214/17-STS630

Linero, Antonio R.; Daniels, Michael J. (May 2018, Statistical Science)

Full Text Available
A review of tree-based Bayesian methods

https://doi.org/10.29220/CSAM.2017.24.6.543

Linero, Antonio R. (November 2017, Communications for Statistical Applications and Methods)

Full Text Available
