skip to main content

Search for: All records

Creators/Authors contains: "Liu, Yanchao"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 1, 2024
  2. This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process and develops an efficient search algorithm that is able to solve practical instances of the MIP model faster than commercial solvers. The formulation is novel for it directly maximizes the Gini reduction, an effective split selection criterion that has never been modeled in a mathematical program for its nonconvexity. The proposed approach differs from other optimal classification tree models in that it does not attempt to optimize the whole tree; therefore, the flexibility of the recursive partitioning scheme is retained, and the optimization model is more amenable. The approach is implemented in an open-source R package named bsnsing. Benchmarking experiments on 75 open data sets suggest that bsnsing trees are the most capable of discriminating new cases compared with trees trained by other decision tree codes including the rpart, C50, party, and tree packages in R. Compared with other optimal decision tree packages, including DL8.5, OSDT, GOSDT, and indirectly more, bsnsing stands out in its training speed, ease of use, and broader applicability without losing in prediction accuracy. History: Accepted by RamRamesh, Area Editor for Data Science & Machine Learning.more »Funding: This work was supported by the National Science Foundation Division of Civil, MechanicalandManufacturing Innovation [Grant 1944068]. Supplemental Material: Data are available at .« less
    Free, publicly-accessible full text available November 1, 2023
  3. Free, publicly-accessible full text available August 1, 2023
  4. Free, publicly-accessible full text available August 1, 2023
  5. Abstract

    The p-center location problem in an area is an important yet very difficult problem in location science. The objective is to determine the location of p hubs within a service area so that the distance from any point in the area to its nearest hub is as small as possible. While effective heuristic methods exist for finding good feasible solutions, research work that probes the lower bound of the problem’s objective value is still limited. This paper presents an iterative solution framework along with two optimization-based heuristics for computing and improving the lower bound, which is at the core of the problem’s difficulty. One method obtains the lower bound via solving the discrete version of the Euclidean p-center problem, and the other via solving a relatively easier clustering problem. Both methods have been validated in various test cases, and their performances can serve as a benchmark for future methodological improvements.