A Gaussian latent variable model for incomplete mixed type data

Ajirak, Marzieh; Djurić, Petar M

Citation Details

In many machine learning problems, one has to work with data of different types, including continuous, discrete, and categorical data. Further, it is often the case that many of these data are missing from the database. This paper proposes a Gaussian process framework that efficiently captures the information in mixed numerical and categorical data and effectively handles missing values. First, we propose a generative model for the mixed-type data. The generative model exploits Gaussian processes with kernels constructed from the latent vectors. We also propose a method for inference of the unknowns, and in its implementation, we rely on a sparse spectrum approximation of the Gaussian processes and variational inference. We demonstrate the performance of the method for both supervised and unsupervised tasks. First, we investigate the imputation of missing variables in an unsupervised setting, and then we show the results of joint imputation and classification on IBM employee data.
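The abstract mentions a sparse spectrum approximation of Gaussian processes. As a rough illustration of that general idea only, and not of the authors' implementation, the sketch below approximates a squared-exponential kernel with a finite set of randomly sampled frequencies. All names (`rbf_kernel`, `sample_frequencies`, `features`, `approx_kernel`), the choice of kernel, the length scale, and the two example input vectors are assumptions made for this example; the inputs merely stand in for low-dimensional latent vectors.

```python
import math
import random


def rbf_kernel(x, y, lengthscale=1.0):
    """Exact squared-exponential kernel between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)


def sample_frequencies(dim, num_freqs, lengthscale=1.0, seed=0):
    """Draw spectral frequencies from the kernel's spectral density,
    which for the squared-exponential kernel is N(0, I / lengthscale^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
            for _ in range(num_freqs)]


def features(x, freqs):
    """Map x to 2M trigonometric features (cosines, then sines)."""
    scale = 1.0 / math.sqrt(len(freqs))
    proj = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in freqs]
    return ([scale * math.cos(p) for p in proj]
            + [scale * math.sin(p) for p in proj])


def approx_kernel(x, y, freqs):
    """Inner product of the feature maps; approximates rbf_kernel(x, y)."""
    return sum(a * b for a, b in zip(features(x, freqs), features(y, freqs)))


# Hypothetical 2-D latent vectors, used only for illustration.
z1 = [0.3, -0.2]
z2 = [0.8, 0.1]
freqs = sample_frequencies(dim=2, num_freqs=2000)

exact = rbf_kernel(z1, z2)
approx = approx_kernel(z1, z2, freqs)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

With M sampled frequencies, the kernel matrix over N points is replaced by a product of N-by-2M feature matrices, which is what makes this family of approximations attractive for larger data sets. How the paper combines it with variational inference is not described in the abstract and is not shown here.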

Award ID(s): 2212506

PAR ID: 10417055

Author(s) / Creator(s): Ajirak, Marzieh; Djurić, Petar M

Publisher / Repository: IEEE

Date Published: 2023-01-01

Journal Name: Conference record IEEE International Conference on Acoustics Speech and Signal Processing

ISSN: 0749-842X

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
