Creators/Authors contains: "Basu, Pallavi"

  1. Model selection is crucial both to high-dimensional learning and to inference for contemporary big data applications in pinpointing the best set of covariates among a sequence of candidate interpretable models. Most existing work implicitly assumes that the models are correctly specified or have fixed dimensionality, yet both model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles under the misspecified generalized linear models presented in Lv & Liu (2014), and investigate the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates the Kullback–Leibler divergence, we suggest using the high-dimensional generalized Bayesian information criterion with prior probability for large-scale model selection with misspecification. Our new information criterion characterizes the impacts of both model misspecification and high dimensionality on model selection. We further establish the consistency of covariance contrast matrix estimation and the model selection consistency of the new information criterion in ultrahigh dimensions under some mild regularity conditions. Our numerical studies demonstrate that the proposed method enjoys improved model selection consistency over its main competitors.
  2. High‐throughput screening (HTS) is a large‐scale hierarchical process in which a large number of chemicals are tested in multiple stages. Conventional statistical analyses of HTS studies often suffer from high testing error rates and soaring costs in large‐scale settings. This article develops new methodologies for false discovery rate control and optimal design in HTS studies. We propose a two‐stage procedure that determines the optimal numbers of replicates at different screening stages while simultaneously controlling the false discovery rate in the confirmatory stage subject to a constraint on the total budget. The merits of the proposed methods are illustrated using both simulated and real data. We show that, at the expense of a limited budget, the proposed screening procedure effectively controls the error rate and the design leads to improved detection power.
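
For readers unfamiliar with false discovery rate control, the sketch below shows the standard Benjamini-Hochberg step-up procedure applied to a list of p-values. It is not the two-stage procedure proposed in the article, which also optimizes the number of replicates under a budget constraint; the function name `benjamini_hochberg`, the default `alpha`, and the sample p-values are assumptions chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected.

    Illustrative only: this is the classical Benjamini-Hochberg step-up rule,
    not the two-stage screening procedure described in the abstract.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. one per compound in a confirmatory stage.
    sample = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(sample, alpha=0.05))
```

The step-up rule rejects all hypotheses up to the largest rank that meets its threshold, even if some smaller ranks individually fail theirs, which is what distinguishes it from a simple per-test cutoff.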
