NSF PAR Search | NSF Public Access Repository


Robust Low-Rank Tensor Train Recovery

https://doi.org/10.1109/TSP.2025.3561121

Qin, Zhen; Zhu, Zhihui (January 2025, IEEE Transactions on Signal Processing)

Free, publicly-accessible full text available January 1, 2026
Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery

Qin, Zhen; Wakin, Michael B; Zhu, Zhihui (December 2024, Journal of Machine Learning Research)

Free, publicly-accessible full text available December 1, 2025
On the connection between least squares, regularization, and classical shadows

https://doi.org/10.22331/q-2024-08-29-1455

Zhu, Zhihui; Lukens, Joseph M; Kirby, Brian T (August 2024, Quantum)

Classical shadows (CS) offer a resource-efficient means to estimate quantum observables, circumventing the need for exhaustive state tomography. Here, we clarify and explore the connection between CS techniques and least squares (LS) and regularized least squares (RLS) methods commonly used in machine learning and data analysis. By formal identification of LS and RLS "shadows" completely analogous to those in CS (namely, point estimators calculated from the empirical frequencies of single measurements), we show that both RLS and CS can be viewed as regularizers for the underdetermined regime, replacing the pseudoinverse with invertible alternatives. Through numerical simulations, we evaluate RLS and CS from three distinct angles: the tradeoff in bias and variance, mismatch between the expected and actual measurement distributions, and the interplay between the number of measurements and number of shots per measurement. Compared to CS, RLS attains lower variance at the expense of bias, is robust to distribution mismatch, and is more sensitive to the number of shots for a fixed number of state copies; these differences can be understood from the distinct approaches taken to regularization. Conceptually, our integration of LS, RLS, and CS under a unifying "shadow" umbrella aids in advancing the overall picture of CS techniques, while practically our results highlight the tradeoffs intrinsic to these measurement approaches, illuminating the circumstances under which either RLS or CS would be preferred, such as unverified randomness for the former or unbiased estimation for the latter.
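The abstract's remark about replacing the pseudoinverse with an invertible alternative can be illustrated on a generic linear system. The following is a minimal sketch, not the paper's quantum estimator: the matrix `A`, vector `b`, helper names, and the regularization weight `lam` are all hypothetical, and the RLS variant shown is ordinary ridge regression.

```python
import numpy as np

def ls_estimate(A, b):
    """Minimum-norm least-squares solution via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(A) @ b

def rls_estimate(A, b, lam):
    """Ridge solution (A^T A + lam*I)^{-1} A^T b; the matrix is invertible for lam > 0."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# Hypothetical underdetermined problem: 5 equations, 12 unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 12))
b = rng.standard_normal(5)

x_ls = ls_estimate(A, b)
x_rls = rls_estimate(A, b, lam=0.1)

# LS fits the data exactly; RLS trades a nonzero residual for a smaller-norm estimate.
print("LS residual: ", np.linalg.norm(A @ x_ls - b))
print("RLS residual:", np.linalg.norm(A @ x_rls - b))
print("norms (LS, RLS):", np.linalg.norm(x_ls), np.linalg.norm(x_rls))
```

In this toy setting the ridge estimate is a shrunken version of the pseudoinverse solution and approaches it as `lam` goes to zero, which loosely mirrors the bias-for-variance trade described in the abstract.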
Full Text Available
Quantum State Tomography for Matrix Product Density Operators

https://doi.org/10.1109/TIT.2024.3360951

Qin, Zhen; Jameson, Casey; Gong, Zhexuan; Wakin, Michael B; Zhu, Zhihui (July 2024, IEEE Transactions on Information Theory)

Full Text Available
A Global Geometric Analysis of Maximal Coding Rate Reduction

Wang, Peng; Liu, Huikang; Pai, Druv; Yu, Yaodong; Zhu, Zhihui; Qu, Qing; Ma, Yi (June 2024, International Conference on Machine Learning)

The maximal coding rate reduction (MCR²) objective for learning structured and compact deep representations is drawing increasing attention, especially after its recent usage in the derivation of fully explainable and highly effective deep network architectures. However, it lacks a complete theoretical justification: only the properties of its global optima are known, and its global landscape has not been studied. In this work, we give a complete characterization of the properties of all its local and global optima, as well as other types of critical points. Specifically, we show that each (local or global) maximizer of the MCR² problem corresponds to a low-dimensional, discriminative, and diverse representation, and furthermore, each critical point of the objective is either a local maximizer or a strict saddle point. Such a favorable landscape makes MCR² a natural choice of objective for learning diverse and discriminative representations via first-order optimization methods. To validate our theoretical findings, we conduct extensive experiments on both synthetic and real data sets.
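For readers unfamiliar with the objective, the sketch below evaluates a rate reduction of the commonly stated form: the coding rate of all features minus the class-weighted coding rates of each class. The abstract itself does not spell out the formula, so this form, the function names, the distortion parameter `eps`, and the toy features are all assumptions made for illustration.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """0.5 * logdet(I + d / (n * eps^2) * Z Z^T) for features Z of shape (d, n)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T))
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Coding rate of the whole set minus the class-weighted per-class coding rates."""
    n = Z.shape[1]
    within = 0.0
    for k in np.unique(labels):
        Zk = Z[:, labels == k]
        within += (Zk.shape[1] / n) * coding_rate(Zk, eps)
    return coding_rate(Z, eps) - within

labels = np.array([0, 0, 1, 1])
I4 = np.eye(4)
Z_diverse = I4[:, [0, 1, 2, 3]]    # the two classes span orthogonal subspaces
Z_collapsed = I4[:, [0, 1, 0, 1]]  # both classes occupy the same subspace

print("diverse:  ", rate_reduction(Z_diverse, labels))
print("collapsed:", rate_reduction(Z_collapsed, labels))
```

On this toy input the orthogonal-subspace configuration scores strictly higher than the configuration where both classes share a subspace, consistent with the abstract's description of maximizers as discriminative and diverse.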
Full Text Available
Generalized Neural Collapse for a Large Number of Classes

Jiang, Jiachen; Zhou, Jinxin; Wang, Peng; Qu, Qing; Mixon, Dustin; You, Chong; Zhu, Zhihui (June 2024, International Conference on Machine Learning)

Neural collapse provides an elegant mathematical characterization of learned last-layer representations (a.k.a. features) and classifier weights in deep classification models. Such results not only provide insights but also motivate new techniques for improving practical deep models. However, most existing empirical and theoretical studies of neural collapse focus on the case where the number of classes is small relative to the dimension of the feature space. This paper extends neural collapse to cases where the number of classes is much larger than the dimension of the feature space, which broadly occur in language models, retrieval systems, and face recognition applications. We show that the features and classifier exhibit a generalized neural collapse phenomenon, where the minimum one-vs-rest margin is maximized. We provide an empirical study to verify the occurrence of generalized neural collapse in practical deep neural networks. Moreover, we provide a theoretical study to show that generalized neural collapse provably occurs under the unconstrained feature model with a spherical constraint, under certain technical conditions on the feature dimension and the number of classes.
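As a rough illustration of the quantity mentioned in the abstract, the sketch below computes a minimum one-vs-rest margin. It assumes the margin of a sample is its own-class logit minus the largest competing logit, with unit-norm classifier rows and features; that definition, the function name, and the toy configurations are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def min_one_vs_rest_margin(W, H, labels):
    """Smallest gap between the own-class logit and the best competing logit.

    W: (K, d) classifier weights, H: (d, n) features, labels: (n,) class indices.
    """
    logits = W @ H                       # shape (K, n)
    cols = np.arange(H.shape[1])
    own = logits[labels, cols]           # logit of the correct class per sample
    rest = logits.copy()
    rest[labels, cols] = -np.inf         # mask out the correct class
    return float(np.min(own - rest.max(axis=0)))

def unit_vectors(angles):
    """Rows are unit vectors in the plane at the given angles."""
    return np.column_stack([np.cos(angles), np.sin(angles)])

labels = np.arange(3)
W_even = unit_vectors(2 * np.pi * np.arange(3) / 3)      # three classes spread 120 degrees apart
W_uneven = unit_vectors(np.array([0.0, np.pi / 2, np.pi]))

print("evenly spread:  ", min_one_vs_rest_margin(W_even, W_even.T, labels))
print("unevenly spread:", min_one_vs_rest_margin(W_uneven, W_uneven.T, labels))
```

With three classes in two dimensions, the evenly spread arrangement yields the larger minimum margin in this toy comparison, which gives some intuition for why maximizing that margin pushes class directions apart on the sphere.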
Full Text Available
Understanding and Improving Transfer Learning of Deep Models via Neural Collapse

Li, Xiao; Liu, Sheng; Zhou, Jinxin; Lu, Xinyu; Fernandez-Granda, Carlos; Zhu, Zhihui; Qu, Qing (June 2024, Transactions on Machine Learning Research)

Full Text Available
