Minimax Rates for High-Dimensional Random Tessellation Forests

O'Reilly, Eliza; Tran, Ngoc Mai

Random forests are a popular class of algorithms used for regression and classification. The algorithm introduced by Breiman in 2001 and many of its variants are ensembles of randomized decision trees built from axis-aligned partitions of the feature space. One such variant, called Mondrian forests, was proposed to handle the online setting and is the first class of random forests for which minimax optimal rates were obtained in arbitrary dimension. However, the restriction to axis-aligned splits fails to capture dependencies between features, and random forests that use oblique splits have shown improved empirical performance for many tasks. This work shows that a large class of random forests with general split directions also achieves minimax optimal rates in arbitrary dimension. This class includes STIT forests, a generalization of Mondrian forests to arbitrary split directions, and random forests derived from Poisson hyperplane tessellations. These are the first results showing that random forest variants with oblique splits can attain minimax optimality in arbitrary dimension. Our proof technique relies on a novel application of the theory of stationary random tessellations in stochastic geometry to statistical learning theory.

Award ID(s): 2002255

PAR ID: 10560173

Publisher / Repository: Journal of Machine Learning Research

Date Published: 2024-03-01

Journal Name: Journal of Machine Learning Research

Volume: 25

ISSN: 1533-7928

Page Range / eLocation ID: 1-32

Subject(s) / Keyword(s): random forest regression; Mondrian process; STIT tessellation; Poisson hyperplane tessellation; minimax risk bound

Sponsoring Org: National Science Foundation
