Abstract: We propose a very fast approximate Markov chain Monte Carlo sampling framework that is applicable to a large class of sparse Bayesian inference problems. The computational cost per iteration in several regression models is of order O(n(s+J)), where n is the sample size, s is the underlying sparsity of the model, and J is the size of a randomly selected subset of regressors. This cost can be further reduced by data sub-sampling when stochastic gradient Langevin dynamics are employed. The algorithm is an extension of the asynchronous Gibbs sampler of Johnson et al. [(2013). Analyzing Hogwild parallel Gaussian Gibbs sampling. In Proceedings of the 26th International Conference on Neural Information Processing Systems (NIPS’13) (Vol. 2, pp. 2715–2723)], but can be viewed from a statistical perspective as a form of Bayesian iterated sure independence screening [Fan, J., Samworth, R., & Wu, Y. (2009). Ultrahigh dimensional feature selection: Beyond the linear model. Journal of Machine Learning Research, 10, 2013–2038]. We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal with high probability under some statistical assumptions. Furthermore, we show that its mixing time is at most linear in the number of regressors. We illustrate the algorithm with several models.
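As a rough illustration of the O(n(s+J)) per-iteration cost, the sketch below refreshes only the current support plus J randomly chosen coordinates in a sparse linear regression. The soft-thresholded coordinate move and the function name `subset_gibbs_step` are hypothetical simplifications standing in for the actual spike-and-slab Gibbs update; only the cost structure mirrors the paper's claim.

```python
import numpy as np

rng = np.random.default_rng(0)

def subset_gibbs_step(X, y, beta, J=10, lam=0.1, thresh=0.05):
    """One iteration of an illustrative random-subset coordinate update.

    Only the current support plus J randomly chosen coordinates are
    refreshed, so the per-iteration cost is O(n * (s + J)).  The
    soft-thresholded coordinate move is a stand-in for a proper
    spike-and-slab Gibbs draw (assumption, not the paper's algorithm).
    """
    p = X.shape[1]
    support = np.flatnonzero(np.abs(beta) > 0)
    candidates = rng.choice(p, size=J, replace=False)
    active = np.union1d(support, candidates)
    resid = y - X @ beta
    for j in active:
        resid += X[:, j] * beta[j]                 # remove j's contribution
        b = X[:, j] @ resid / (X[:, j] @ X[:, j])  # coordinate-wise OLS fit
        beta[j] = np.sign(b) * max(abs(b) - lam, 0.0)  # soft-threshold
        if abs(beta[j]) < thresh:                  # prune tiny coefficients
            beta[j] = 0.0
        resid -= X[:, j] * beta[j]                 # restore updated contribution
    return beta
```

Iterating this step lets strong signals enter the active set once they are sampled into a random subset, after which they stay in the (cheap) support updates.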
Variable selection and feature screening
This chapter provides a selective review of feature screening methods for ultrahigh dimensional data. The main idea of feature screening is to reduce the ultrahigh dimensionality of the feature space to a moderate size in a fast and efficient way while retaining all the important features in the reduced feature space; this is referred to as the sure screening property. After feature screening, more sophisticated methods can be applied to the reduced feature space for further analysis, such as parameter estimation and statistical inference. This chapter focuses only on the feature screening stage. From the perspective of data type, we review feature screening methods for independent and identically distributed data, longitudinal data, and survival data. From the perspective of modeling, we review various models including the linear model, generalized linear model, additive model, varying-coefficient model, and Cox model. We also cover some model-free feature screening procedures.
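The basic screening step can be sketched in a few lines: rank features by absolute marginal correlation with the response and keep the top d, as in sure independence screening (SIS) for the linear model. The function name `sis_screen` is ours; practical procedures add iterative refinement and model-specific marginal utilities.

```python
import numpy as np

def sis_screen(X, y, d):
    """Keep the indices of the d features with the largest absolute
    marginal correlation with the response (basic SIS)."""
    Xc = (X - X.mean(0)) / X.std(0)          # standardize each feature
    yc = (y - y.mean()) / y.std()            # standardize the response
    corr = np.abs(Xc.T @ yc) / len(y)        # marginal correlations
    return np.argsort(corr)[::-1][:d]        # top-d indices
```

A model like the lasso can then be fit on the d retained columns instead of the full ultrahigh-dimensional design.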
- Award ID(s): 1820702
- PAR ID: 10217340
- Date Published:
- Journal Name: Macroeconomic Forecasting in the Era of Big Data
- Page Range / eLocation ID: 293-326
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Classical Mixtures of Experts (MoE) are machine learning models that partition the input space and train a separate "expert" model on each partition. Recently, MoE-based model architectures have become popular as a means to reduce training and inference costs; there, the partitioning function and the experts are learnt jointly via gradient-descent-type methods on the log-likelihood. In this paper we study theoretical guarantees of the Expectation Maximization (EM) algorithm for the training of MoE models. We first rigorously analyze EM for MoE where the conditional distribution of the target and latent variable conditioned on the feature variable belongs to an exponential family of distributions, and show its equivalence to projected mirror descent with unit step size and a Kullback-Leibler divergence regularizer. This perspective allows us to derive new convergence results and identify conditions for local linear convergence. In the special case of a mixture of two linear or logistic experts, we additionally provide guarantees for linear convergence based on the signal-to-noise ratio. Experiments on synthetic and (small-scale) real-world data support that EM outperforms the gradient descent algorithm both in convergence rate and in achieved accuracy.
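For intuition, here is a minimal EM loop for a two-component mixture of linear regressions, a simplified special case of the MoE setting in which the gating weight is a constant pi rather than a function of x and the noise level sigma is treated as known. The function name and interface are illustrative, not from the paper.

```python
import numpy as np

def em_two_linear_experts(X, y, iters=50, sigma=0.5, a_init=None, seed=0):
    """EM for y ~ pi * N(x @ a0, sigma^2) + (1 - pi) * N(x @ a1, sigma^2).

    A simplified stand-in for mixture-of-experts training: the gate is a
    constant mixing weight (assumption) rather than a learned function of x.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a = np.array(a_init, dtype=float) if a_init is not None else rng.normal(size=(2, d))
    pi = 0.5
    for _ in range(iters):
        # E-step: responsibility of expert 0 for each observation
        ll0 = -0.5 * ((y - X @ a[0]) / sigma) ** 2 + np.log(pi)
        ll1 = -0.5 * ((y - X @ a[1]) / sigma) ** 2 + np.log(1 - pi)
        r0 = 1.0 / (1.0 + np.exp(np.clip(ll1 - ll0, -30, 30)))
        # M-step: weighted least squares per expert, then mixing weight
        for k, w in ((0, r0), (1, 1.0 - r0)):
            W = X * w[:, None]
            a[k] = np.linalg.solve(X.T @ W + 1e-8 * np.eye(d), W.T @ y)
        pi = float(np.clip(r0.mean(), 1e-3, 1 - 1e-3))
    return a, pi
```

Each M-step solves the weighted normal equations exactly, which is what makes the mirror-descent view of EM (exact minimization of a KL-regularized surrogate) natural in the exponential-family case.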
- Behavioural research and its applications have a rich history in science, with direct applications continuing to expand global understanding of ecosystem function, structure, and evolution. The same can be said for such research in the applied sciences, including entomology. The purpose of this chapter is to provide context for various approaches to assessing the behaviour of insects that are mass produced for food and feed. Using the black soldier fly as a model, various approaches for conducting such research are explored, along with some perspective on the value of such data for optimising insect production. It should be noted, however, that this chapter is not exhaustive with regard to the variables that can be examined or the methods employed.
- This work proposes a new computational framework for learning a structured generative model for real-world datasets. In particular, we propose to learn a Closed-loop Transcription between a multi-class, multi-dimensional data distribution and a Linear discriminative representation (CTRL) in a feature space that consists of multiple independent multi-dimensional linear subspaces. We argue that the optimal encoding and decoding mappings sought can be formulated as a two-player minimax game between the encoder and decoder for the learned representation. A natural utility function for this game is the so-called rate reduction, a simple information-theoretic measure for distances between mixtures of subspace-like Gaussians in the feature space. Our formulation draws inspiration from closed-loop error feedback in control systems and avoids expensive evaluation and minimization of approximated distances between arbitrary distributions in either the data space or the feature space. To a large extent, this new formulation unifies the concepts and benefits of Auto-Encoding and GAN and naturally extends them to the setting of learning a representation that is both discriminative and generative for multi-class, multi-dimensional real-world data. Our extensive experiments on many benchmark imagery datasets demonstrate the tremendous potential of this new closed-loop formulation: under fair comparison, the visual quality of the learned decoder and the classification performance of the encoder are competitive and arguably better than existing methods based on GAN, VAE, or a combination of both. Unlike existing generative models, the learned features of the multiple classes are structured rather than hidden: different classes are explicitly mapped onto corresponding independent principal subspaces in the feature space, and diverse visual attributes within each class are modeled by the independent principal components within each subspace.
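The rate-reduction utility mentioned above can be computed directly from a feature matrix. The sketch below follows the standard MCR^2 form (whole-set coding rate minus the class-weighted average of per-class coding rates); the function names and the choice eps = 0.5 are ours.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T) for Z of shape (d, n)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Delta R: coding rate of all features minus the class-weighted
    average of per-class coding rates.  Maximized when different classes
    occupy independent (orthogonal) subspaces."""
    n = Z.shape[1]
    r_parts = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]
        r_parts += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return coding_rate(Z, eps) - r_parts
```

By subadditivity of log-det, the difference is nonnegative and is zero when splitting the features into classes gains nothing, which is why maximizing it pushes classes toward independent subspaces.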
- Phylogenetic regression is a type of generalised least squares (GLS) method that incorporates a modelled covariance matrix based on the evolutionary relationships between species (i.e. phylogenetic relationships). While this method has found widespread use in hypothesis testing via phylogenetic comparative methods, such as phylogenetic ANOVA, its ability to account for non-linear relationships has received little attention. To address this, we implement a phylogenetic Kernel Ridge Regression (phyloKRR) method that utilises GLS in a high-dimensional feature space, employing linear combinations of phylogenetically weighted data to account for non-linearity. We analysed two biological datasets using the Radial Basis Function and linear kernel functions. The first dataset contained morphometric data, while the second comprised discrete trait data with diversification rates as the response variable. Hyperparameter tuning was achieved through cross-validation rounds on the training set. In the tested biological datasets, phyloKRR reduced the error rate (as measured by RMSE) by around 20% compared to linear regression when the data did not exhibit linear relationships. In simulated datasets, the error rate decreased almost exponentially with the level of non-linearity. These results show that introducing kernels into phylogenetic regression presents a novel and promising tool for complementing phylogenetic comparative methods. We have integrated this method into a Python package named phyloKRR, which is freely available at https://github.com/ulises-rosas/phylokrr.
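A minimal reconstruction of the idea, assuming a phylogenetic covariance matrix C is available: whiten the data by the Cholesky factor of C (the GLS step), then run ordinary RBF kernel ridge regression on the whitened data. This is an illustrative sketch, not the phyloKRR package's actual API; the function name and defaults are ours.

```python
import numpy as np

def phylo_krr_fitted(X, y, C, lam=1.0, gamma=1.0):
    """Fitted values of RBF kernel ridge regression after GLS whitening.

    C is the (n x n) phylogenetic covariance among species; if C = L L^T,
    premultiplying by L^{-1} removes the phylogenetic correlation.
    Returns fitted values in the whitened space.
    """
    L = np.linalg.cholesky(C)
    Xw = np.linalg.solve(L, X)            # whitened features
    yw = np.linalg.solve(L, y)            # whitened response
    diff = Xw[:, None, :] - Xw[None, :, :]
    K = np.exp(-gamma * (diff ** 2).sum(-1))   # RBF kernel matrix
    alpha = np.linalg.solve(K + lam * np.eye(len(yw)), yw)  # ridge solve
    return K @ alpha
```

With C equal to the identity (no phylogenetic structure) this reduces to ordinary kernel ridge regression, which makes the GLS transform easy to sanity-check.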