Online matrix factorization for Markovian dataand applications to Network Dictionary Learning

Lyu, Hanbaek; Needell, Deanna; Balzano, Laura

Online Matrix Factorization (OMF) is a fundamental tool for dictionary learning problems,giving an approximate representation of complex data sets in terms of a reduced number ofextracted features. Convergence guarantees for most of the OMF algorithms in the litera-ture assume independence between data matrices, and the case of dependent data streamsremains largely unexplored. In this paper, we show that a non-convex generalization ofthe well-known OMF algorithm for i.i.d. stream of data in (Mairal et al., 2010) convergesalmost surely to the set of critical points of the expected loss function, even when the datamatrices are functions of some underlying Markov chain satisfying a mild mixing condition.This allows one to extract features more efficiently from dependent data streams, as thereis no need to subsample the data sequence to approximately satisfy the independence as-sumption. As the main application, by combining online non-negative matrix factorizationand a recent MCMC algorithm for sampling motifs from networks, we propose a novelframework ofNetwork Dictionary Learning, which extracts “network dictionary patches”from a given network in an online manner that encodes main features of the network. Wedemonstrate this technique and its application to network denoising problems on real-worldnetwork data

More Like this