skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Strong Optimal Classification Trees
Decision trees are among the most interpretable and popular machine learning models that are used routinely in applications ranging from revenue management to medicine. Traditional heuristic methods, although fast, lack modeling flexibility for incorporating constraints such as fairness and do not guarantee optimality. Recent efforts aim to overcome these limitations using mixed-integer optimization (MIO) for better modeling flexibility and optimality, but speed remains an issue. In “Strong Optimal Classification Trees,” Aghaei, Gómez, and Vayanos use integer optimization and polyhedral theory to create an MIO-based formulation with a strong LO relaxation resulting in a 29% speed-up in training time compared with state-of-the-art MIO-based formulations, as well as up to an 8% improvement in out-of-sample accuracy.  more » « less
Award ID(s):
2046230
PAR ID:
10587249
Author(s) / Creator(s):
; ;
Publisher / Repository:
INFORMS
Date Published:
Journal Name:
Operations Research
ISSN:
0030-364X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The L 0 -regularized least squares problem (a.k.a. best subsets) is central to sparse statistical learning and has attracted significant attention across the wider statistics, machine learning, and optimization communities. Recent work has shown that modern mixed integer optimization (MIO) solvers can be used to address small to moderate instances of this problem. In spite of the usefulness of L 0 -based estimators and generic MIO solvers, there is a steep computational price to pay when compared with popular sparse learning algorithms (e.g., based on L 1 regularization). In this paper, we aim to push the frontiers of computation for a family of L 0 -regularized problems with additional convex penalties. We propose a new hierarchy of necessary optimality conditions for these problems. We develop fast algorithms, based on coordinate descent and local combinatorial optimization, that are guaranteed to converge to solutions satisfying these optimality conditions. From a statistical viewpoint, an interesting story emerges. When the signal strength is high, our combinatorial optimization algorithms have an edge in challenging statistical settings. When the signal is lower, pure L 0 benefits from additional convex regularization. We empirically demonstrate that our family of L 0 -based estimators can outperform the state-of-the-art sparse learning algorithms in terms of a combination of prediction, estimation, and variable selection metrics under various regimes (e.g., different signal strengths, feature correlations, number of samples and features). Our new open-source sparse learning toolkit L0Learn (available on CRAN and GitHub) reaches up to a threefold speedup (with p up to 10 6 ) when compared with competing toolkits such as glmnet and ncvreg. 
    more » « less
  2. We study the problem of learning, from observational data, fair and interpretable policies that effectively match heterogeneous individuals to scarce resources of different types. We model this problem as a multi-class multi-server queuing system where both individuals and resources arrive stochastically over time. Each individual, upon arrival, is assigned to a queue where they wait to be matched to a resource. The resources are assigned in a first come first served (FCFS) fashion according to an eligibility structure that encodes the resource types that serve each queue. We propose a methodology based on techniques in modern causal inference to construct the individual queues as well as learn the matching outcomes and provide a mixed-integer optimization (MIO) formulation to optimize the eligibility structure. The MIO problem maximizes policy outcome subject to wait time and fairness constraints. It is very flexible, allowing for additional linear domain constraints. We conduct extensive analyses using synthetic and real-world data. In particular, we evaluate our framework using data from the U.S. Homeless Management Information System (HMIS). We obtain wait times as low as an FCFS policy while improving the rate of exit from homelessness for underserved or vulnerable groups (7% higher for the Black individuals and 15% higher for those below 17 years old) and overall. 
    more » « less
  3. This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process and develops an efficient search algorithm that is able to solve practical instances of the MIP model faster than commercial solvers. The formulation is novel for it directly maximizes the Gini reduction, an effective split selection criterion that has never been modeled in a mathematical program for its nonconvexity. The proposed approach differs from other optimal classification tree models in that it does not attempt to optimize the whole tree; therefore, the flexibility of the recursive partitioning scheme is retained, and the optimization model is more amenable. The approach is implemented in an open-source R package named bsnsing. Benchmarking experiments on 75 open data sets suggest that bsnsing trees are the most capable of discriminating new cases compared with trees trained by other decision tree codes including the rpart, C50, party, and tree packages in R. Compared with other optimal decision tree packages, including DL8.5, OSDT, GOSDT, and indirectly more, bsnsing stands out in its training speed, ease of use, and broader applicability without losing in prediction accuracy. History: Accepted by RamRamesh, Area Editor for Data Science & Machine Learning. Funding: This work was supported by the National Science Foundation Division of Civil, MechanicalandManufacturing Innovation [Grant 1944068]. Supplemental Material: Data are available at https://doi.org/10.1287/ijoc.2022.1225 . 
    more » « less
  4. Large workloads of event trend aggregation queries are widely deployed to derive high-level insights about current event trends in near real time. To speed-up the execution, we identify and leverage sharing opportunities from complex patterns with flat Kleene operators or even nested Kleene expressions. We propose Gloria, a graph-based sharing optimizer for event trend aggregation. First, we map the sharing optimization problem to a graph path search problem in the Gloria graph with execution costs encoded as weights. Second, we shrink the search space by applying cost-driven pruning principles that guarantee optimality of the reduced Gloria graph in most cases. Lastly, we propose a path search algorithm that identifies the sharing plan with minimum execution costs. Our experimental study on three real-world data sets demonstrates that our Gloria optimizer effectively reduces the search space, leading to 5-fold speed-up in optimization time. The optimized plan consistently reduces the query latency by 68%-93% compared to the plan generated by state-of-the-art approaches. 
    more » « less
  5. Trajectory optimization o↵ers mature tools for motion planning in high-dimensional spaces under dynamic constraints. However, when facing complex configuration spaces, cluttered with obstacles, roboticists typically fall back to sampling-based planners that struggle in very high dimensions and with continuous di↵erential constraints. Indeed, obstacles are the source of many textbook examples of problematic nonconvexities in the trajectory-optimization prob- lem. Here we show that convex optimization can, in fact, be used to reliably plan trajectories around obstacles. Specifically, we consider planning problems with collision-avoidance constraints, as well as cost penalties and hard constraints on the shape, the duration, and the velocity of the trajectory. Combining the properties of B ́ezier curves with a recently-proposed framework for finding shortest paths in Graphs of Convex Sets (GCS), we formulate the planning problem as a compact mixed-integer optimization. In stark contrast with existing mixed-integer planners, the convex relaxation of our programs is very tight, and a cheap round- ing of its solution is typically sufficient to design globally-optimal trajectories. This reduces the mixed-integer program back to a simple convex optimization, and automatically provides optimality bounds for the planned trajectories. We name the proposed planner GCS, after its underlying optimization framework. We demonstrate GCS in simulation on a variety of robotic platforms, including a quadrotor flying through buildings and a dual-arm manipulator (with fourteen degrees of freedom) moving in a confined space. Using numerical experiments on a seven-degree-of-freedom manipulator, we show that GCS can outperform widely-used sampling-based planners by finding higher-quality trajectories in less time. 
    more » « less