skip to main content

Title: NeuroBE: Escalating neural network approximations of Bucket Elimination
A major limiting factor in graphical model inference is the complexity of computing the partition function. Exact message-passing algorithms such as Bucket Elimination (BE) require exponential memory to compute the partition function; therefore, approximations are necessary. In this paper, we build upon a recently introduced methodology called Deep Bucket Elimination (DBE) that uses classical Neural Networks to approximate messages generated by BE for large buckets. The main feature of our new scheme, renamed NeuroBE, is that it customizes the architecture of the neural networks, their learning process and in particular, adapts the loss function to the internal form or distribution of messages. Our experiments demonstrate significant improvements in accuracy and time compared with the earlier DBE scheme.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Cussens, James; Zhang, Kun
Date Published:
Journal Name:
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PLMR
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)

    Bucket Elimination (BE) is a universal inference scheme that can solve most tasks over probabilistic and deterministic graphical models exactly.However, it often requires exponentially high levels of memory (in the induced-width) preventing its execution. In the spirit of exploiting Deep Learning for inference tasks, in this paper, we will use neural networks to approximate BE.The resulting Deep Bucket Elimination (DBE) algorithm is developed for computing the partition function.We provide a proof-of-concept empirically using instances from several different benchmarks, showing that DBE can be a more accurate approximation than current state-of-the-art approaches for approximating BE (e.g. the mini-bucket schemes), especially when problems are sufficiently hard.

    more » « less
  2. Camps-Valls, Gustau ; Ruiz, Francisco J. ; Valera, Isabel (Ed.)
    Bayesian Networks are useful for analyzing the properties of systems with large populations of interacting agents (e.g., in social modeling applications and distributed service applications). These networks typically have large functions (CPTs), making exact inference intractable. However, often these models have additive symmetry. In this paper we show how summation-based CPTs, especially in the presence of symmetry, can be computed efficiently through the usage of the Fast Fourier Transform (FFT). In particular, we propose an efficient method using the FFT for reducing the size of Conditional Probability Tables (CPTs) in Bayesian Networks with summation-based causal independence (CI). We show how to apply it directly towards the acceleration of Bucket Elimination, and we subsequently provide experimental results demonstrating the computational speedup provided by our method. 
    more » « less
  3. null (Ed.)
    Abstract Molecular interaction networks are powerful resources for molecular discovery. They are increasingly used with machine learning methods to predict biologically meaningful interactions. While deep learning on graphs has dramatically advanced the prediction prowess, current graph neural network (GNN) methods are mainly optimized for prediction on the basis of direct similarity between interacting nodes. In biological networks, however, similarity between nodes that do not directly interact has proved incredibly useful in the last decade across a variety of interaction networks. Here, we present SkipGNN, a graph neural network approach for the prediction of molecular interactions. SkipGNN predicts molecular interactions by not only aggregating information from direct interactions but also from second-order interactions, which we call skip similarity. In contrast to existing GNNs, SkipGNN receives neural messages from two-hop neighbors as well as immediate neighbors in the interaction network and non-linearly transforms the messages to obtain useful information for prediction. To inject skip similarity into a GNN, we construct a modified version of the original network, called the skip graph. We then develop an iterative fusion scheme that optimizes a GNN using both the skip graph and the original graph. Experiments on four interaction networks, including drug–drug, drug–target, protein–protein, and gene–disease interactions, show that SkipGNN achieves superior and robust performance. Furthermore, we show that unlike popular GNNs, SkipGNN learns biologically meaningful embeddings and performs especially well on noisy, incomplete interaction networks. 
    more » « less
  4. Deep Neural Networks (DNNs) are pervasively used in a significant number of applications and platforms. To enhance the execution efficiency of large-scale DNNs, previous attempts focus mainly on client-server paradigms, relying on powerful external infrastructure, or model compression, with complicated pre-processing phases. Though effective, these methods overlook the optimization of DNNs on distributed mobile devices. In this work, we design and implement MeDNN, a local distributed mobile computing system with enhanced partitioning and deployment tailored for large-scale DNNs. In MeDNN, we first propose Greedy Two Dimensional Partition (GTDP), which can adaptively partition DNN models onto several mobile devices w.r.t. individual resource constraints. We also propose Structured Model Compact Deployment (SMCD), a mobile-friendly compression scheme which utilizes a structured sparsity pruning technique to further accelerate DNN execution. Experimental results show that, GTDP can accelerate the original DNN execution time by 1.86 – 2.44⇥ with 2 – 4 worker nodes. By utilizing SMCD, 26.5% of additional computing time and 14.2% of extra communication time are saved, on average, with negligible effect on the model accuracy. 
    more » « less
  5. Meila, Marina ; Zhang, Tong (Ed.)
    Federated Learning (FL) is an emerging learning scheme that allows different distributed clients to train deep neural networks together without data sharing. Neural networks have become popular due to their unprecedented success. To the best of our knowledge, the theoretical guarantees of FL concerning neural networks with explicit forms and multi-step updates are unexplored. Nevertheless, training analysis of neural networks in FL is non-trivial for two reasons: first, the objective loss function we are optimizing is non-smooth and non-convex, and second, we are even not updating in the gradient direction. Existing convergence results for gradient descent-based methods heavily rely on the fact that the gradient direction is used for updating. The current paper presents a new class of convergence analysis for FL, Federated Neural Tangent Kernel (FL-NTK), which corresponds to overparamterized ReLU neural networks trained by gradient descent in FL and is inspired by the analysis in Neural Tangent Kernel (NTK). Theoretically, FL-NTK converges to a global-optimal solution at a linear rate with properly tuned learning parameters. Furthermore, with proper distributional assumptions, FL-NTK can also achieve good generalization. The proposed theoretical analysis scheme can be generalized to more complex neural networks. 
    more » « less