Title: Assessment of Projection Pursuit Index for Classifying High Dimension Low Sample Size Data in R
Analyzing "large p, small n" data is increasingly important in a wide range of application fields. As a projection pursuit index, the Penalized Discriminant Analysis (PDA) index, built upon the Linear Discriminant Analysis (LDA) index, was devised in Lee and Cook (2010) to classify high-dimensional data with promising results. Yet little information is available about its performance relative to the popular Support Vector Machine (SVM). This paper conducts extensive numerical studies comparing the performance of the PDA index with the LDA index and SVM, demonstrating that the PDA index is robust to outliers and able to handle high-dimensional datasets with extremely small sample sizes, few important variables, and multiple classes. Analyses of several motivating real-world datasets reveal the practical advantages and limitations of the individual methods, suggesting that the PDA index provides a useful alternative tool for classifying complex high-dimensional data. These new insights, along with the hands-on implementation of the PDA index functions in the R package classPP, help statisticians and data scientists make effective use of both sets of classification tools.
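The classPP implementation is in R; as an illustrative sketch in Python (NumPy), an LDA-type projection pursuit index with a PDA-style ridge penalty can be written as below. The penalty form `(1 - lam) * W + lam * n * I` and the helper names `scatter_matrices` and `pda_index` are assumptions for illustration, not the exact formulation of Lee and Cook (2010).

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-group (W) and between-group (B) sum-of-squares matrices."""
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        D = Xc - mc
        W += D.T @ D                                  # within-class spread
        d = (mc - grand_mean).reshape(-1, 1)
        B += Xc.shape[0] * (d @ d.T)                  # between-class spread
    return W, B

def pda_index(A, X, y, lam=0.0):
    """Projection index 1 - |A'W_lam A| / |A'(W_lam + B) A| for a p x k
    projection matrix A; lam > 0 shrinks W toward the identity so the
    index stays stable when p greatly exceeds n (assumed penalty form)."""
    W, B = scatter_matrices(X, y)
    n, p = X.shape
    W_lam = (1 - lam) * W + lam * n * np.eye(p)
    num = np.linalg.det(A.T @ W_lam @ A)
    den = np.linalg.det(A.T @ (W_lam + B) @ A)
    return 1.0 - num / den
```

A projection that separates the classes well drives the index toward 1, while a projection along a direction with no class separation drives it toward 0; projection pursuit then searches over A to maximize the index.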
Award ID(s): 2013486; 1712418
PAR ID: 10454058
Author(s) / Creator(s):
Date Published:
Journal Name: Journal of Data Science
Volume: 21
Issue: 2
ISSN: 1680-743X
Page Range / eLocation ID: 310 to 332
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. While linear discriminant analysis (LDA) is a widely used classification method, it is highly affected by outliers, which commonly occur in real datasets. Several robust LDA methods have therefore been proposed. However, they either rely on robust estimation of the sample means and covariance matrix, which may have noninvertible Hessians, or can only handle binary classes or low-dimensional cases. The proposed robust discriminant analysis is a multi-directional projection-pursuit approach that can classify multiple classes without estimating the covariance or Hessian matrix and works in high-dimensional cases. The weight function effectively gives smaller weights to points more deviant from the class center. The discriminant vectors and scoring vectors are solved by the proposed iterative algorithm. It inherits the good properties of the weight function and multi-directional projection pursuit, reducing the influence of outliers on the estimated discriminant directions and producing robust classification that is less sensitive to outliers. We show that when the weight function is appropriately chosen, the influence function is bounded and the discriminant vectors and scoring vectors are both consistent as the percentage of outliers goes to zero. The experimental results show that the robust optimal scoring discriminant analysis is effective and efficient. 
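The abstract above does not specify the weight function; a Huber-style taper based on distance to a robust class center is one common choice, sketched here as an assumption (the function name `robust_class_weights` and the cutoff `c` are illustrative).

```python
import numpy as np

def robust_class_weights(X, y, c=2.0):
    """Downweight points far from their class center (Huber-style taper).
    Distances are scaled by the class-wise median distance, so weights are
    1 for typical points and shrink toward 0 for deviant ones."""
    w = np.ones(len(X))
    for cls in np.unique(y):
        idx = np.where(y == cls)[0]
        center = np.median(X[idx], axis=0)            # robust class center
        d = np.linalg.norm(X[idx] - center, axis=1)
        scale = np.median(d) + 1e-12                  # robust scale
        r = d / scale
        w[idx] = np.where(r <= c, 1.0, c / r)         # taper beyond cutoff c
    return w
```

Plugging such weights into the projection-pursuit objective bounds each point's influence, which is what yields the bounded influence function claimed in the abstract.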
  2. The problem of classifying multiple categorical responses is fundamental in modern machine learning and statistics, with diverse applications in fields such as bioinformatics and imaging. This manuscript investigates linear discriminant analysis (LDA) with high-dimensional predictors and multiple multi-class responses. Specifically, we first examine two different classification scenarios under the bivariate LDA model: joint classification of the two responses and conditional classification of one response while observing the other. To achieve optimal classification rules for both scenarios, we introduce two novel tensor formulations of the discriminant coefficients and corresponding regularization strategies. For joint classification, we propose an overlapping group lasso penalty and a blockwise coordinate descent algorithm to efficiently compute the joint discriminant coefficient tensors. For conditional classification, we utilize an alternating direction method of multipliers (ADMM) algorithm to compute the discriminant coefficient tensors under new constraints. We then extend our method and algorithms to general multivariate responses. Finally, we validate the effectiveness of our approach through simulation studies and applications to benchmark datasets. 
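In group-lasso-penalized problems like the one above, each blockwise coordinate-descent or ADMM subproblem typically reduces to the standard group-lasso proximal operator (blockwise soft-thresholding). This is that generic operator, not the paper's exact tensor update:

```python
import numpy as np

def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2: shrinks the whole group v
    toward zero and zeroes it out entirely when ||v||_2 <= lam, which
    is how the group lasso selects or drops groups as a unit."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v
```

The all-or-nothing behavior at the group level is what lets the penalty zero out entire slices of the discriminant coefficient tensor.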
  3. Low-dimensional discriminative representations enhance machine learning methods in both performance and complexity. This has motivated supervised dimensionality reduction (DR), which transforms high-dimensional data into a discriminative subspace. Most DR methods require data to be i.i.d. However, in some domains, data naturally appear in sequences, where the observations are temporally correlated. We propose a DR method, namely, latent temporal linear discriminant analysis (LT-LDA), to learn low-dimensional temporal representations. We construct the separability among sequence classes by lifting the holistic temporal structures, which are established based on temporal alignments and may change in different subspaces. We jointly learn the subspace and the associated latent alignments by optimizing an objective that favors easily separable temporal structures. We show that this objective is connected to the inference of alignments and thus allows for an iterative solution. We provide both theoretical insight and empirical evaluations on several real-world sequence datasets to show the applicability of our method. 
  4. Multilinear discriminant analysis (MLDA), a novel approach based upon recent developments in tensor-tensor decomposition, was recently proposed and has shown better performance than traditional matrix linear discriminant analysis (LDA). The current paper presents a nonlinear generalization of MLDA (referred to as KMLDA) by extending the well-known "kernel trick" to multilinear data. The approach proceeds by defining a new dot product based on new tensor operators for third-order tensors. Experimental results on the ORL, extended Yale B, and COIL-100 datasets demonstrate that performing MLDA in feature space provides more class separability. It is also shown that the proposed KMLDA approach performs better than the Tucker-based discriminant analysis methods in terms of image classification. 
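The paper's novelty is a tensor dot product; the kernel trick itself is easiest to see in the ordinary vector case, sketched below: a Gram matrix gives feature-space inner products without ever forming the feature map, and centering in feature space is done directly on the Gram matrix.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2): inner
    products in an implicit feature space, computed without an
    explicit feature map."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center_gram(K):
    """Center the implicit features: K <- HKH with H = I - 11'/n,
    the feature-space analogue of subtracting the mean."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H
```

Kernelized discriminant methods then optimize over coefficients of the training points, touching the data only through K.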
  5. We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper corresponds to the category of template-based modeling. The algorithm performs a preselection of protein templates before constructing a lower-dimensional subspace via a regularized LDA. The protein coordinates in the reduced space are sampled using a highly explorative optimization algorithm, regressive–regressive PSO (RR-PSO). The obtained structure is then projected onto a reduced space via singular value decomposition and further optimized via RR-PSO to carry out a structure refinement. The final structures are similar to those predicted by the best structure prediction tools, such as the Rosetta and Zhang servers. The main advantage of our methodology is that it alleviates the ill-posed character of protein structure prediction problems related to high-dimensional optimization. It is also capable of sampling a wide range of conformational space due to the application of a regularized linear discriminant analysis, which allows us to expand the differences over a reduced basis set. 
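RR-PSO is a specific variant; the underlying particle swarm mechanics are the standard velocity update below, where each velocity blends inertia, attraction to the particle's own best position, and attraction to the swarm's best. This plain global-best PSO sketch (function name and parameter defaults are illustrative) is not the paper's RR-PSO:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO: v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))   # initial positions
    v = np.zeros_like(x)
    pbest = x.copy()                                  # per-particle bests
    pbest_f = np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                # swarm best
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better] = x[better]
        pbest_f[better] = fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()
```

In the pipeline above, f would be a scoring function over coordinates in the LDA- or SVD-reduced subspace, which keeps dim small enough for the swarm to explore effectively.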