Recursive calls over recursive data are useful for generating probability distributions, and probabilistic programming allows computations over these distributions to be expressed in a modular and intuitive way. Exact inference is also useful, but unfortunately, existing probabilistic programming languages do not perform exact inference on recursive calls over recursive data, forcing programmers to code many applications manually. We introduce a probabilistic language in which a wide variety of recursion can be expressed naturally, and inference carried out exactly. For instance, probabilistic pushdown automata and their generalizations are easy to express, and polynomial-time parsing algorithms for them are derived automatically. We eliminate recursive data types using program transformations related to defunctionalization and refunctionalization. These transformations are assured correct by a linear type system, and a successful choice of transformations, if there is one, is guaranteed to be found by a greedy algorithm.
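The kind of recursive probabilistic computation described above can be made concrete with a tiny example: the termination probability of a branching process (a degenerate probabilistic pushdown automaton) is the least fixed point of a polynomial equation, which simple iteration finds. This is a minimal illustrative sketch, not the paper's language or algorithm; all names are made up for the example.

```python
# Sketch: a recursive process stops with probability `stop_p` and otherwise
# spawns two independent copies of itself. Its termination probability p
# satisfies p = stop_p + (1 - stop_p) * p**2; iterating from 0 converges to
# the least fixed point of that equation.

def termination_probability(stop_p: float, iters: int = 10_000) -> float:
    """Least fixed point of p = stop_p + (1 - stop_p) * p**2."""
    p = 0.0
    for _ in range(iters):
        p = stop_p + (1.0 - stop_p) * p * p
    return p

print(round(termination_probability(0.4), 4))  # → 0.6667, the smaller root
p_half = termination_probability(0.5)          # approaches 1.0 (slowly: double root)
```

For `stop_p = 0.4` the equation `p = 0.4 + 0.6 p**2` has roots 2/3 and 1, and iteration picks out the least one, 2/3.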
Compact Representation of Uncertainty in Clustering
For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models). However, we know of no previous work studying efficient representations of exact distributions over clusterings. This paper presents definitions and proofs for a dynamic-programming inference procedure that computes the partition function, the marginal probability of a cluster, and the MAP clustering, all exactly. Rather than the Nth Bell number, these exact solutions take time and space proportional to the substantially smaller powerset of N. Indeed, we improve upon the time complexity of the algorithm introduced by Kohonen and Corander [11] for this problem by a factor of N. While still large, this previously unknown result is intellectually interesting in its own right, makes feasible exact inference for important real-world small-data applications (such as medicine), and provides a natural stepping stone towards sparse-trellis approximations that enable further scalability (which we also explore). In experiments, we demonstrate the superiority of our approach over approximate methods in analyzing real-world gene expression data used in cancer treatment.
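The flavor of such a subset dynamic program can be sketched as follows: the partition function over clusterings of {0, …, N−1} decomposes over the cluster that contains the smallest remaining element. This is an illustrative reconstruction, not the authors' code; with unit cluster weights it simply counts partitions, i.e., it reproduces the Bell numbers, and a learned cluster score would replace `weight`.

```python
from functools import lru_cache
from itertools import combinations

# Sketch of a subset DP for the clustering partition function:
#   Z(S) = sum over clusters C containing min(S) of weight(C) * Z(S \ C)
# with Z(empty) = 1. Memoizing on frozensets gives time/space exponential in
# N rather than the (much larger) Bell number of full clusterings.

def partition_function(n: int, weight=lambda cluster: 1.0) -> float:
    @lru_cache(maxsize=None)
    def z(elems: frozenset) -> float:
        if not elems:
            return 1.0
        first = min(elems)
        rest = sorted(elems - {first})
        total = 0.0
        # Every cluster containing `first` is `first` plus a subset of the rest.
        for k in range(len(rest) + 1):
            for extra in combinations(rest, k):
                cluster = frozenset({first, *extra})
                total += weight(cluster) * z(elems - cluster)
        return total

    return z(frozenset(range(n)))

print(partition_function(4))  # → 15.0, the Bell number B(4)
print(partition_function(5))  # → 52.0, the Bell number B(5)
```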
 Award ID(s):
 1637536
 NSF-PAR ID:
 10112136
 Date Published:
 Journal Name:
 Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018
 Page Range / eLocation ID:
 8639-8649
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this


Failure time data of fielded systems are usually obtained from the actual users of the systems. Due to various operational preferences and/or technical obstacles, a large proportion of field data are collected as aggregate data instead of the exact failure times of individual units. The challenge of using such data is that the obtained information is more concise but less precise in comparison to using individual failure times. The most significant needs in modeling aggregate failure time data are the selection of an appropriate probability distribution and the development of a statistical inference procedure capable of handling data aggregation. Although some probability distributions, such as the Gamma and Inverse Gaussian distributions, have well-known closed-form expressions for the probability density function for aggregate data, the use of such distributions limits the applications in field reliability estimation. For reliability practitioners, it would be invaluable to use a robust approach to handle aggregate failure time data without being limited to a small number of probability distributions. This paper studies the application of the phase-type (PH) distribution as a candidate for modeling aggregate failure time data. An expectation-maximization algorithm is developed to obtain the maximum likelihood estimates of model parameters, and the confidence interval for the reliability estimate is also obtained. The simulation and numerical studies show that the robust approach is quite powerful because of the high capability of PH distributions in mimicking a variety of probability distributions. In the area of reliability engineering, there is limited work on modeling aggregate data for field reliability estimation. The analytical and statistical inference methods described in this work provide a robust tool for analyzing aggregate failure time data for the first time.
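To make the PH model family concrete: a phase-type distribution is the absorption time of a continuous-time Markov chain with initial probability vector α and sub-generator S, and its density is f(t) = α · exp(St) · s₀ with exit-rate vector s₀ = −S·1. The two-phase parameters below are illustrative only (this particular choice is the sum of Exp(2) and Exp(3) delays), and the small Taylor-series matrix exponential is a convenience, not part of the paper's method.

```python
import numpy as np

def expm(M: np.ndarray, terms: int = 60) -> np.ndarray:
    """Matrix exponential via a plain Taylor series (fine for small matrices)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

alpha = np.array([1.0, 0.0])        # start in phase 0
S = np.array([[-2.0, 2.0],          # phase 0 -> phase 1 at rate 2
              [0.0, -3.0]])         # phase 1 -> absorption at rate 3
s0 = -S @ np.ones(2)                # exit rates into the absorbing state

def ph_pdf(t: float) -> float:
    """Density f(t) = alpha @ exp(S t) @ s0 of the PH distribution."""
    return float(alpha @ expm(S * t) @ s0)

# This PH law is hypoexponential: f(t) = 6 * (exp(-2t) - exp(-3t)).
print(round(ph_pdf(0.5), 4))  # → 0.8685 = 6 * (e^-1 - e^-1.5)
```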

One of the primary challenges in graphical models is inference, or reconstructing a marginal probability from the graphical model's factorized representation. While tractable for some graphs, the cost of inference grows exponentially with the graphical model's complexity, necessitating approximation for more complex graphs. For interactive applications, latency is the dominant concern, making approximate inference the only feasible option. Unfortunately, approximate inference can be wasteful for interactive applications, as exact inference can still converge faster, even for moderately complex inference problems. In this paper, we propose a new family of convergent inference algorithms (CIAs) that bridge the gap between approximations and exact solutions, providing early, incrementally improving approximations that become exact after a finite period of time. We describe two specific CIAs based on a cryptographic technique called linear congruential generators, including a novel incremental join algorithm for dense relations called Leaky Joins. We conclude with experiments that demonstrate the utility of Leaky Joins for convergent inference: on both synthetic and real-world probabilistic graphical models, Leaky Joins converge to exact marginal probabilities almost as fast as state of the art exact inference algorithms, while simultaneously achieving approximations that are almost as good as state of the art approximation algorithms.
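The "early, incrementally improving, eventually exact" idea can be illustrated in miniature: stream the terms of an exact sum-product computation so that each partial sum yields an anytime estimate that becomes exact once every term has been added. The tiny chain model A–B–C and its factors below are made up for the example; the paper's Leaky Joins algorithm realizes this idea far more cleverly over database-style join plans.

```python
from itertools import product

# Pairwise factors of a toy chain A - B - C over binary variables.
phi_ab = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
phi_bc = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}

def convergent_marginal_c():
    """Yield an improving estimate of P(C) after each joint-assignment term."""
    unnorm = {0: 0.0, 1: 0.0}
    for a, b, c in product((0, 1), repeat=3):
        unnorm[c] += phi_ab[(a, b)] * phi_bc[(b, c)]
        z = unnorm[0] + unnorm[1]
        # Each intermediate estimate could be served to an interactive client.
        yield {v: (unnorm[v] / z if z else 0.0) for v in (0, 1)}

*_, exact = convergent_marginal_c()
print(exact)  # final estimate is the exact marginal: P(C) = (11/27, 16/27)
```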

Stein variational gradient descent (SVGD) is a particle-based inference algorithm that leverages gradient information for efficient approximate inference. In this work, we enhance SVGD by leveraging preconditioning matrices, such as the Hessian and Fisher information matrix, to incorporate geometric information into SVGD updates. We achieve this by presenting a generalization of SVGD that replaces the scalar-valued kernels in vanilla SVGD with more general matrix-valued kernels. This yields a significant extension of SVGD, and more importantly, allows us to flexibly incorporate various preconditioning matrices to accelerate the exploration in the probability landscape. Empirical results show that our method outperforms vanilla SVGD and a variety of baseline approaches over a range of real-world Bayesian inference tasks.
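The vanilla-SVGD update the paragraph generalizes can be sketched on a 1-D standard-normal target: each particle moves along the kernel-weighted average of the other particles' scores, plus a repulsive kernel-gradient term that keeps the particles spread out. The matrix-valued extension replaces the scalar kernel k(x, x′) with a preconditioning matrix-valued kernel. The RBF bandwidth, step size, and particle count below are arbitrary choices for illustration.

```python
import numpy as np

def svgd_step(x: np.ndarray, step: float = 0.1, h: float = 1.0) -> np.ndarray:
    """One vanilla SVGD update for a 1-D N(0, 1) target with an RBF kernel."""
    score = -x                              # grad log p(x) for N(0, 1)
    diff = x[:, None] - x[None, :]          # diff[j, i] = x_j - x_i
    k = np.exp(-diff ** 2 / (2 * h))        # kernel matrix k(x_j, x_i)
    grad_k = -diff / h * k                  # d/dx_j k(x_j, x_i)
    # phi_i = (1/n) sum_j [ k(x_j, x_i) * score_j + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ score + grad_k.sum(axis=0)) / len(x)
    return x + step * phi

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=0.5, size=50)  # particles start far from the mode
for _ in range(500):
    x = svgd_step(x)
print(round(float(x.mean()), 2))             # particle mean drifts toward 0
```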

Inference of unknown opinions from uncertain, adversarial (e.g., incorrect or conflicting) evidence in large datasets is not a trivial task. Without proper handling, it can easily mislead decision making in data mining tasks. In this work, we propose a highly scalable probabilistic opinion inference model, namely Adversarial Collective Opinion Inference (AdvCOI), which infers unknown opinions with high scalability and robustness in the presence of uncertain, adversarial evidence by enhancing Collective Subjective Logic (CSL), itself developed by combining Subjective Logic (SL) and Probabilistic Soft Logic (PSL). The key idea behind AdvCOI is to learn a model that is robust against uncertain, adversarial evidence, which is formulated as a min-max problem. We validate that AdvCOI outperforms baseline models and its competitive counterparts under possible adversarial attacks on logic-rule-based structured data, and under white-box and black-box adversarial attacks on both clean and perturbed semi-synthetic and real-world datasets in three real-world applications. The results show that AdvCOI achieves the lowest mean absolute error in the expected truth probability while requiring the least running time among all methods compared.