NSF PAR Search | NSF Public Access Repository

Machine Learning-Aided Efficient Decoding of Reed–Muller Subcodes

https://doi.org/10.1109/JSAIT.2023.3298362

Jamali, Mohammad Vahid; Liu, Xiyang; Makkuva, Ashok Vardhan; Mahdavifar, Hessam; Oh, Sewoong; Viswanath, Pramod (January 2023, IEEE Journal on Selected Areas in Information Theory)

Reed-Muller Subcodes: Machine Learning-Aided Design of Efficient Soft Recursive Decoding

https://doi.org/10.1109/ISIT45174.2021.9517885

Jamali, Mohammad Vahid; Liu, Xiyang; Makkuva, Ashok Vardhan; Mahdavifar, Hessam; Oh, Sewoong; Viswanath, Pramod (July 2021, 2021 IEEE International Symposium on Information Theory (ISIT))
KO codes: inventing nonlinear encoding and decoding for reliable wireless communication via deep-learning

Makkuva, Ashok V; Liu, Xiyang; Jamali, Mohammad Vahid; Mahdavifar, Hessam; Oh, Sewoong; Viswanath, Pramod (July 2021, Proceedings of the 38th International Conference on Machine Learning)

Learning in Gated Neural Networks

Makkuva, Ashok Vardhan; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod (January 2020, Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) 2020, Palermo, Italy. PMLR: Volume 108.)

Gating is a key feature in modern neural networks including LSTMs, GRUs and sparsely-gated deep neural networks. The backbone of such gated networks is a mixture-of-experts (MoE) layer, where several experts make regression decisions and gating controls how to weigh the decisions in an input-dependent manner. Despite having such a prominent role in both modern and classical machine learning, very little is understood about parameter recovery of mixture-of-experts since gradient descent and EM algorithms are known to get stuck in local optima in such models. In this paper, we perform a careful analysis of the optimization landscape and show that with appropriately designed loss functions, gradient descent can indeed learn the parameters of a MoE accurately. A key idea underpinning our results is the design of two distinct loss functions, one for recovering the expert parameters and another for recovering the gating parameters. We demonstrate the first sample complexity results for parameter recovery in this model for any algorithm and demonstrate significant performance gains over standard loss functions in numerical experiments.
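The mixture-of-experts layer described in this abstract can be illustrated with a minimal sketch. It assumes linear experts and a softmax gate, and it returns the gate-weighted average of the expert outputs; the function names and shapes are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predict(x, expert_weights, gating_weights):
    """Gate-weighted combination of k linear experts (illustrative only).

    x:              (n, d) inputs
    expert_weights: (k, d) one regression vector per expert (assumed linear)
    gating_weights: (k, d) one gating vector per expert (assumed softmax gate)
    returns:        (n,) predictions
    """
    expert_outputs = x @ expert_weights.T        # (n, k): each expert's decision
    gate_probs = softmax(x @ gating_weights.T)   # (n, k): input-dependent weights
    return (gate_probs * expert_outputs).sum(axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
experts = rng.normal(size=(2, 3))
gates = rng.normal(size=(2, 3))
y = moe_predict(x, experts, gates)
```

With all gating weights set to zero the gate is uniform, so the prediction reduces to the plain average of the expert outputs.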
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms

Makkuva, Ashok; Viswanath, Pramod; Kannan, Sreeram; Oh, Sewoong (June 2019, Proceedings of Machine Learning Research)

Mixture-of-Experts (MoE) is a widely popular model for ensemble learning and is a basic building block of highly successful modern neural networks as well as a component in Gated Recurrent Units (GRU) and Attention networks. However, present algorithms for learning MoE, including the EM algorithm and gradient descent, are known to get stuck in local optima. From a theoretical viewpoint, finding an efficient and provably consistent algorithm to learn the parameters has remained a long-standing open problem for more than two decades. In this paper, we introduce the first algorithm that learns the true parameters of a MoE model for a wide class of non-linearities with global consistency guarantees. While existing algorithms jointly or iteratively estimate the expert parameters and the gating parameters in the MoE, we propose a novel algorithm that breaks the deadlock and can directly estimate the expert parameters by sensing their echo in a carefully designed cross-moment tensor between the inputs and the output. Once the experts are known, the recovery of gating parameters still requires an EM algorithm; however, we show that the EM algorithm for this simplified problem, unlike the joint EM algorithm, converges to the true parameters. We empirically validate our algorithm on both synthetic and real data sets in a variety of settings, and show superior performance to standard baselines.
Learning One-hidden-layer Neural Networks under General Input Distributions

Gao, Weihao; Makkuva, Ashok Vardhan; Oh, Sewoong; Viswanath, Pramod (April 2019, Proceedings of Machine Learning Research)

Significant advances have been made recently in training neural networks, where the main challenge is in solving an optimization problem with abundant critical points. However, existing approaches to address this issue crucially rely on a restrictive assumption: the training data is drawn from a Gaussian distribution. In this paper, we provide a novel unified framework to design loss functions with desirable landscape properties for a wide range of general input distributions. On these loss functions, remarkably, stochastic gradient descent theoretically recovers the true parameters with global initializations and empirically outperforms the existing approaches. Our loss function design bridges the notion of score functions with the topic of neural network optimization. Central to our approach is the task of estimating the score function from samples, which is of basic and independent interest to theoretical statistics. Traditional estimation methods (for example, kernel-based ones) fail right at the outset; we bring statistical methods of local likelihood to design a novel estimator of score functions that provably adapts to the local geometry of the unknown density.
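For orientation, the first-order score function of a differentiable density p is commonly defined as the gradient of its log-density. The display below states that standard definition and its closed form in the Gaussian case; the symbols S, p, μ and Σ are notation assumed here, not taken from the paper, and some authors use the opposite sign convention.

```latex
\[
  S(x) \;=\; \nabla_x \log p(x) \;=\; \frac{\nabla_x p(x)}{p(x)},
  \qquad
  p = \mathcal{N}(\mu, \Sigma) \;\Longrightarrow\; S(x) = -\Sigma^{-1}(x - \mu).
\]
```

In the Gaussian case the score is a linear function of x with a known form, whereas for a general, unknown density it has to be estimated from samples, which is the task the abstract highlights.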
Barracuda: The Power of ℓ-polling in Proof-of-Stake Blockchains

Fanti, Giulia; Jiao, Jiantao; Makkuva, Ashok; Oh, Sewoong; Rana, Ranvir; Viswanath, Pramod (July 2019, Proceedings of the ... ACM International Symposium on Mobile Ad Hoc Networking & Computing)

A blockchain is a database of sequential events that is maintained by a distributed group of nodes. A key consensus problem in blockchains is that of determining the next block (data element) in the sequence. Many blockchains address this by electing a new node to propose each new block. The new block is (typically) appended to the tip of the proposer’s local blockchain, and subsequently broadcast to the rest of the network. Without network delay (or adversarial behavior), this procedure would give a perfect chain, since each proposer would have the same view of the blockchain. A major challenge in practice is forking. Due to network delays, a proposer may not yet have the most recent block, and may therefore create a side chain that branches from the middle of the main chain. Forking reduces throughput, since only a single main chain can survive, and all other blocks are discarded. We propose a new P2P protocol for blockchains called Barracuda, in which each proposer, prior to proposing a block, polls ℓ other nodes for their local blocktree information. Under a stochastic network model, we prove that this lightweight primitive improves throughput as if the entire network were a factor of ℓ faster. We provide guidelines on how to implement Barracuda in practice, guaranteeing robustness against several real-world factors.
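The polling step described in this abstract can be sketched as a toy example. It assumes each node's local view is summarized by a single (height, block_id) tip and that the proposer extends the deepest tip it has seen; both simplifications, and the names `propose` and `longest_tip`, are hypothetical and not taken from the paper.

```python
import random

def longest_tip(views):
    """Return the deepest (height, block_id) tip among the given local views."""
    return max(views, key=lambda tip: tip[0])

def propose(proposer, local_tips, ell, rng):
    """Poll `ell` other nodes, then extend the deepest tip seen (toy model).

    local_tips maps node name -> (height, block_id) of that node's local tip.
    Returns (new_block_height, parent_block_id).
    """
    others = [node for node in local_tips if node != proposer]
    polled = rng.sample(others, min(ell, len(others)))
    seen = [local_tips[proposer]] + [local_tips[node] for node in polled]
    parent_height, parent_id = longest_tip(seen)
    return parent_height + 1, parent_id

# Node "a" lags behind: its local tip is at height 3 while others are at 5.
tips = {"a": (3, "b3"), "b": (5, "b5"), "c": (4, "b4"), "d": (5, "b5")}

without_polling = propose("a", tips, 0, random.Random(0))  # extends its stale tip
with_polling = propose("a", tips, 3, random.Random(0))     # polls all three peers
```

In this toy setting, a lagging proposer that polls no one builds on its stale tip and creates a fork, while polling every peer lets it build on the most recent block.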
Learning in Gated Neural Networks

Makkuva, Ashok; Oh, Sewoong; Kannan, Sreeram; Viswanath, Pramod (January 2019, Proceedings of Machine Learning Research)

Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms

Makkuva, Ashok Vardhan; Oh, Sewoong; Kannan, Sreeram; Viswanath, Pramod (January 2019, International Conference on Machine Learning)

Barracuda: The Power of ℓ-polling in Proof-of-Stake Blockchains

https://doi.org/10.1145/3323679.3326533

Fanti, Giulia; Jiao, Jiantao; Makkuva, Ashok; Oh, Sewoong; Rana, Ranvir; Viswanath, Pramod (January 2019, ACM MobiHoc)
