Masked Prediction: A Parameter Identifiability View

Liu, Bingbin; Hsu, Daniel J.; Ravikumar, Pradeep; Risteski, Andrej

Citation Details

The vast majority of work in self-supervised learning has focused on assessing recovered features by a chosen set of downstream tasks. While there are several commonly used benchmark datasets, this lens of feature learning requires assumptions about the downstream tasks that are not inherent to the data distribution itself. In this paper, we present an alternative lens, one of parameter identifiability: assuming data comes from a parametric probabilistic model, we train a self-supervised learning predictor with a suitable parametric form, and ask whether the parameters of the optimal predictor can be used to extract the parameters of the ground truth generative model. Specifically, we focus on latent-variable models capturing sequential structures, namely Hidden Markov Models with both discrete and conditionally Gaussian observations. We focus on masked prediction as the self-supervised learning task and study the optimal masked predictor. We show that parameter identifiability is governed by the task difficulty, which is determined by the choice of data model and the number of tokens to predict. Technique-wise, we uncover close connections with the uniqueness of tensor rank decompositions, a widely used tool in studying identifiability through the lens of the method of moments.
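To make the notion of an "optimal masked predictor" concrete, the following is a minimal sketch for a toy discrete HMM, not the paper's construction. It assumes a stationary chain, squared loss on one-hot targets (so the optimal predictor is the conditional probability vector), and the simplest task of predicting the token x_2 from x_1. The matrices `T` and `O`, their numerical values, and the helper names `stationary` and `masked_predictor` are all hypothetical and chosen only for illustration.

```python
# Toy discrete HMM with 2 hidden states and 3 observable tokens.
# All numbers are illustrative assumptions, not values from the paper.
# T[i][j] = P(h_{t+1} = i | h_t = j)   (columns sum to 1)
# O[x][h] = P(x_t = x | h_t = h)       (columns sum to 1)
T = [[0.9, 0.2],
     [0.1, 0.8]]
O = [[0.7, 0.1],
     [0.2, 0.3],
     [0.1, 0.6]]


def stationary(T, iters=1000):
    """Approximate the stationary distribution of T by power iteration."""
    k = len(T)
    pi = [1.0 / k] * k
    for _ in range(iters):
        pi = [sum(T[i][j] * pi[j] for j in range(k)) for i in range(k)]
    return pi


def masked_predictor(T, O, pi):
    """Return M with M[x2][x1] = P(x_2 = x2 | x_1 = x1).

    Under squared loss on one-hot targets, column x1 of M is the
    optimal prediction of the masked token x_2 given x_1.
    """
    k, d = len(T), len(O)
    M = [[0.0] * d for _ in range(d)]
    for x1 in range(d):
        # Posterior over the hidden state h_1 given the observed token x_1.
        joint = [O[x1][h] * pi[h] for h in range(k)]
        z = sum(joint)
        post = [p / z for p in joint]
        # Push the posterior one step forward through the transition matrix.
        nxt = [sum(T[i][j] * post[j] for j in range(k)) for i in range(k)]
        # Emit: distribution over x_2.
        for x2 in range(d):
            M[x2][x1] = sum(O[x2][i] * nxt[i] for i in range(k))
    return M


pi = stationary(T)
M = masked_predictor(T, O, pi)
for row in M:
    print([round(v, 4) for v in row])
```

The identifiability question in the abstract then amounts to asking whether `T` and `O` can be recovered, up to relabeling of hidden states, from predictors of this kind; the sketch only computes the predictor and makes no claim about when such recovery is possible.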

Award ID(s): 2211907

PAR ID: 10450563

Author(s) / Creator(s): Liu, Bingbin; Hsu, Daniel J.; Ravikumar, Pradeep; Risteski, Andrej

Date Published: 2022-01-01

Journal Name: Advances in Neural Information Processing Systems

ISSN: 1049-5258

Page Range / eLocation ID: 21241-21254

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0 (Conference Paper)
The DOI is not currently available.
