NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Training Quantized Neural Networks to Global Optimality via Semidefinite Programming

Bartan, Burak; Pilanci, Mert (April 2021, Proceedings of Machine Learning Research)

Neural networks (NNs) have been extremely successful across many tasks in machine learning. Quantization of NN weights has become an important topic due to its impact on their energy efficiency, inference time and deployment on hardware. Although post-training quantization is well-studied, training optimal quantized NNs involves combinatorial non-convex optimization problems which appear intractable. In this work, we introduce a convex optimization strategy to train quantized NNs with polynomial activations. Our method leverages hidden convexity in two-layer neural networks from the recent literature, semidefinite lifting, and Grothendieck’s identity. Surprisingly, we show that certain quantized NN problems can be solved to global optimality provably in polynomial time in all relevant parameters via tight semidefinite relaxations. We present numerical examples to illustrate the effectiveness of our method.
more » « less
Full Text Available
Neural Fisher Discriminant Analysis: Optimal Neural Network Embeddings in Polynomial Time

Bartan, Burak; Pilanci, Mert (April 2021, Proceedings of Machine Learning Research)

Fisher’s Linear Discriminant Analysis (FLDA) is a statistical analysis method that linearly embeds data points to a lower dimensional space to maximize a discrimination criterion such that the variance between classes is maximized while the variance within classes is minimized. We introduce a natural extension of FLDA that employs neural networks, called Neural Fisher Discriminant Analysis (NFDA). This method finds the optimal two-layer neural network that embeds data points to optimize the same discrimination criterion. We use tools from convex optimization to transform the optimal neural network embedding problem into a convex problem. The resulting problem is easy to interpret and solve to global optimality. We evaluate the method’s performance on synthetic and real datasets.
more » « less
Full Text Available
Convex Neural Autoregressive Models: Towards Tractable, Expressive, and Theoretically-Backed Models for Sequential Forecasting and Generation

https://doi.org/10.1109/ICASSP39728.2021.9413662

Gupta, Vikul; Bartan, Burak; Ergen, Tolga; Pilanci, Mert (June 2021, ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))
null (Ed.)
Three features are crucial for sequential forecasting and generation models: tractability, expressiveness, and theoretical backing. While neural autoregressive models are relatively tractable and offer powerful predictive and generative capabilities, they often have complex optimization landscapes, and their theoretical properties are not well understood. To address these issues, we present convex formulations of autoregressive models with one hidden layer. Specifically, we prove an exact equivalence between these models and constrained, regularized logistic regression by using semi-infinite duality to embed the data matrix onto a higher dimensional space and introducing inequality constraints. To make this formulation tractable, we approximate the constraints using a hinge loss or drop them altogether. Furthermore, we demonstrate faster training and competitive performance of these implementations compared to their neural network counterparts on a variety of data sets. Consequently, we introduce techniques to derive tractable, expressive, and theoretically-interpretable models that are nearly equivalent to neural autoregressive models.
more » « less
Full Text Available
Training Quantized Neural Networks to Global Optimality via Semidefinite Programming

Bartan, Burak; Pilanci, Mert (January 2021, International Conference on Machine Learning (ICML) 2021)

Neural networks (NNs) have been extremely successful across many tasks in machine learning. Quantization of NN weights has become an important topic due to its impact on their energy efficiency, inference time and deployment on hardware. Although post-training quantization is well-studied, training optimal quantized NNs involves combinatorial non-convex optimization problems which appear intractable. In this work, we introduce a convex optimization strategy to train quantized NNs with polynomial activations. Our method leverages hidden convexity in twolayer neural networks from the recent literature, semidefinite lifting, and Grothendieck’s identity. Surprisingly, we show that certain quantized NN problems can be solved to global optimality provably in polynomial time in all relevant parameters via tight semidefinite relaxations. We present numerical examples to illustrate the effectiveness of our method.
more » « less
Full Text Available
Distributed Black-Box optimization via Error Correcting Codes

https://doi.org/10.1109/ALLERTON.2019.8919691

Bartan, Burak; Pilanci, Mert (September 2019, 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton))

We introduce a novel distributed derivative-free optimization framework that is resilient to stragglers. The proposed method employs coded search directions at which the objective function is evaluated, and a decoding step to find the next iterate. Our framework can be seen as an extension of evolution strategies and structured exploration methods where structured search directions were utilized. As an application, we consider black-box adversarial attacks on deep convolutional neural networks. Our numerical experiments demonstrate a significant improvement in the computation times.
more » « less
Full Text Available
Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization

Derezinski, Michal; Bartan, Burak; Pilanci, Mert; Mahoney, Michael W. (January 2020, Conference on Neural Information Processing Systems)
null (Ed.)
In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data. However, the local estimates on each machine are typically biased, relative to the full solution on all of the data, and this can limit the effectiveness of averaging. Here, we introduce a new technique for debiasing the local estimates, which leads to both theoretical and empirical improvements in the convergence rate of distributed second order methods. Our technique has two novel components: (1) modifying standard sketching techniques to obtain what we call a surrogate sketch; and (2) carefully scaling the global regularization parameter for local computations. Our surrogate sketches are based on determinantal point processes, a family of distributions for which the bias of an estimate of the inverse Hessian can be computed exactly. Based on this computation, we show that when the objective being minimized is l2-regularized with parameter ! and individual machines are each given a sketch of size m, then to eliminate the bias, local estimates should be computed using a shrunk regularization parameter given by (See PDF), where d(See PDF) is the (See PDF)-effective dimension of the Hessian (or, for quadratic problems, the data matrix).
more » « less
Full Text Available
Repairing Multiple Failures for Scalar MDS Codes

https://doi.org/10.1109/TIT.2018.2876542

Mardia, Jay; Bartan, Burak; Wootters, Mary (May 2019, IEEE Transactions on Information Theory)

Full Text Available
Adaptive Hashing for Model Counting

Kuck, Jonathan; d Dao, Tri; Zhao, Shenjia; Bartan, Burak; Sabharwal, Ashish; Ermon, Stefano (July 2019, Conference on Uncertainty in Artificial Intelligence)

Full Text Available

Search for: All records