NSF PAR Search | NSF Public Access Repository

WAVES: benchmarking the robustness of image watermarks

An, Bang; Ding, Mucong; Rabbani, Tahseen; Agrawal, Aakriti; Xu, Yuancheng; Deng, Chenghao; Zhu, Sicheng; Mohamed, Abdirisak; Wen, Yuxin; Goldstein, Tom; et al. (July 2024, Proceedings of the 41st International Conference on Machine Learning)

Full Text Available
SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation

Ding, Mucong; An, Bang; Xu, Yuancheng; Satheesh, Anirudh; Huang, Furong (January 2024, The Twelfth International Conference on Learning Representations)
Transferring Fairness under Distribution Shifts via Fair Consistency Regularization

An, Bang; Che, Zora; Ding, Mucong; Huang, Furong (January 2022, 36th Conference on Neural Information Processing Systems (NeurIPS 2022))

Full Text Available
GANs with Conditional Independence Graphs: On Subadditivity of Probability Divergences

Ding, Mucong; Daskalakis, Constantinos; Feizi, Soheil (January 2021, The 24th International Conference on Artificial Intelligence and Statistics (AISTATS))
Full Text Available
Understanding Overparameterization in Generative Adversarial Networks

Balaji, Yogesh; Sajedi, Mohammadmahdi; Kalibhat, Neha Mukund; Ding, Mucong; Stoger, Dominik; Soltanolkotabi, Mahdi; Feizi, Soheil (April 2021, International Conference on Learning Representations)
A broad class of unsupervised deep learning methods, such as Generative Adversarial Networks (GANs), involves training overparameterized models, in which the number of model parameters exceeds a certain threshold. Indeed, most successful GANs used in practice are trained using overparameterized generator and discriminator networks, both in terms of depth and width. A large body of work in supervised learning has shown the importance of model overparameterization in the convergence of gradient descent (GD) to globally optimal solutions. In contrast, the unsupervised setting, and GANs in particular, involve non-convex concave min-max optimization problems that are often trained using Gradient Descent/Ascent (GDA). The role and benefits of model overparameterization in the convergence of GDA to a global saddle point in non-convex concave problems are far less understood. In this work, we present a comprehensive analysis of the importance of model overparameterization in GANs, both theoretically and empirically. We theoretically show that in an overparameterized GAN model with a 1-layer neural network generator and a linear discriminator, GDA converges to a global saddle point of the underlying non-convex concave min-max problem. To the best of our knowledge, this is the first result for global convergence of GDA in such settings. Our theory is based on a more general result that holds for a broader class of nonlinear generators and discriminators that obey certain assumptions (including deeper generators and random feature discriminators). Our theory utilizes and builds upon a novel connection with the convergence analysis of linear time-varying dynamical systems, which may have broader implications for understanding the convergence behavior of GDA for non-convex concave problems involving overparameterized models. We also empirically study the role of model overparameterization in GANs using several large-scale experiments on the CIFAR-10 and Celeb-A datasets. Our experiments show that overparameterization improves the quality of generated samples across various model architectures and datasets. Remarkably, we observe that overparameterization leads to faster and more stable convergence behavior of GDA across the board.
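As an illustration of the Gradient Descent/Ascent (GDA) update the abstract refers to, the following is a minimal sketch on a toy scalar min-max problem. It is not the paper's GAN model: the objective f(x, y) = y(x - TARGET) - y²/2, the step size, the starting point, and all names (`gda`, `grad_x`, `grad_y`, `TARGET`) are assumptions chosen only to show the simultaneous update, in which the minimizing variable steps against its gradient while the maximizing variable steps along its gradient.

```python
# Toy simultaneous gradient descent/ascent (GDA).
# Assumed objective (not from the paper): f(x, y) = y * (x - TARGET) - y**2 / 2.
# It is concave in y, and its saddle point is x = TARGET, y = 0.

TARGET = 2.0


def grad_x(x, y):
    # Partial derivative of f with respect to the minimizing variable x.
    return y


def grad_y(x, y):
    # Partial derivative of f with respect to the maximizing variable y.
    return (x - TARGET) - y


def gda(grad_x, grad_y, x0, y0, step=0.1, iters=500):
    """Run simultaneous GDA: descend in x, ascend in y, using the same iterate."""
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x, y = x - step * gx, y + step * gy
    return x, y


x_star, y_star = gda(grad_x, grad_y, x0=-1.0, y0=0.5)
print(x_star, y_star)
```

For this particular objective the iterates spiral in toward the saddle point because the quadratic term makes the problem strongly concave in y. Without that term (a purely bilinear objective) simultaneous GDA with a fixed step size would not converge, which is one reason convergence guarantees for GDA are harder to obtain than for plain gradient descent.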
Full Text Available
VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization

Ding, Mucong; Kong, Kezhi; Li, Jingling; Zhu, Chen; Dickerson, John P; Huang, Furong; Goldstein, Tom (January 2021, Advances in Neural Information Processing Systems 34)

Most state-of-the-art Graph Neural Networks (GNNs) can be defined as a form of graph convolution, which can be realized by message passing between direct neighbors or beyond. To scale such GNNs to large graphs, various neighbor-, layer-, or subgraph-sampling techniques have been proposed to alleviate the "neighbor explosion" problem by considering only a small subset of the messages passed to the nodes in a mini-batch. However, sampling-based methods are difficult to apply to GNNs that utilize many-hops-away or global context in each layer, show unstable performance across tasks and datasets, and do not speed up model inference. We propose a principled and fundamentally different approach, VQ-GNN, a universal framework to scale up any convolution-based GNN using Vector Quantization (VQ) without compromising performance. In contrast to sampling-based techniques, our approach can effectively preserve all the messages passed to a mini-batch of nodes by learning and updating a small number of quantized reference vectors of global node representations, using VQ within each GNN layer. Our framework avoids the "neighbor explosion" problem of GNNs by using quantized representations combined with a low-rank version of the graph convolution matrix. We show, both theoretically and experimentally, that such a compact low-rank version of the gigantic convolution matrix is sufficient. Together with VQ, we design a novel approximated message passing algorithm and a nontrivial back-propagation rule for our framework. Experiments on various types of GNN backbones demonstrate the scalability and competitive performance of our framework on large-graph node classification and link prediction benchmarks.
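The vector quantization idea mentioned in the abstract, representing many node feature vectors by a small set of reference vectors, can be sketched roughly as follows. This is a generic nearest-codeword assignment with a mean-based codebook update, not the VQ-GNN algorithm itself: it omits the low-rank convolution matrix, the approximated message passing, and the custom back-propagation rule, and the data, codebook size, and function names (`assign`, `update`) are all hypothetical.

```python
# Generic vector quantization sketch (assumed illustration, not the VQ-GNN method).
# Each feature vector is replaced by the index of its nearest codeword, and each
# codeword is then moved to the mean of the vectors assigned to it.


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def assign(vectors, codebook):
    # Index of the nearest codeword for every vector.
    return [
        min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
        for v in vectors
    ]


def update(vectors, codes, codebook):
    # Move each codeword to the mean of its members; keep it if it has none.
    new_codebook = []
    for k, old in enumerate(codebook):
        members = [v for v, c in zip(vectors, codes) if c == k]
        if not members:
            new_codebook.append(list(old))
            continue
        dim = len(old)
        new_codebook.append(
            [sum(m[d] for m in members) / len(members) for d in range(dim)]
        )
    return new_codebook


# Hypothetical 2-D "node features" forming two well-separated groups.
features = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
codebook = [[0.0, 0.0], [1.0, 1.0]]

for _ in range(5):
    codes = assign(features, codebook)
    codebook = update(features, codes, codebook)

print(codes)
print(codebook)
```

In this toy setting four vectors are summarized by two codewords. The appeal for large graphs is the same in spirit: downstream computation can work with a small codebook plus per-node indices instead of every node's full representation.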
Full Text Available