Title: Tree convolution for probability distributions with unbounded support
Abstract:
We develop the complex-analytic viewpoint on the tree convolutions studied by the second author and Weihua Liu in Jekel and Liu (2020), which generalize the free, boolean, monotone, and orthogonal convolutions. In particular, for each rooted subtree $T$ of the $N$-regular tree (with vertices labeled by alternating strings), we define the convolution $\boxplus_T(\mu_1, \dots, \mu_N)$ for arbitrary probability measures $\mu_1, \dots, \mu_N$ on $\mathbb{R}$ using a certain fixed-point equation for the Cauchy transforms. The convolution operations respect the operad structure of the tree operad from Jekel and Liu (2020). We prove a general limit theorem for iterated $T$-free convolution similar to Bercovici and Pata's results in the free case (Bercovici and Pata 1999), and we deduce limit theorems for measures in the domain of attraction of each of the classical stable laws.
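
For orientation, the classical special cases that $\boxplus_T$ generalizes admit well-known characterizations in terms of the reciprocal Cauchy transform $F_\mu = 1/G_\mu$. The following display collects these standard facts from the literature (not reproduced from the paper), with the free case stated as a fixed-point equation of the kind the $T$-indexed definition extends:

    % Cauchy transform on the upper half-plane, its reciprocal, and the shift
    G_\mu(z) = \int_{\mathbb{R}} \frac{d\mu(t)}{z - t}, \qquad
    F_\mu(z) = \frac{1}{G_\mu(z)}, \qquad
    h_\mu(z) = F_\mu(z) - z.
    % Boolean convolution: the shifts h_\mu add
    F_{\mu_1 \uplus \mu_2}(z) = F_{\mu_1}(z) + F_{\mu_2}(z) - z.
    % Monotone convolution: the F-transforms compose
    F_{\mu_1 \triangleright \mu_2}(z) = F_{\mu_1}(F_{\mu_2}(z)).
    % Free convolution: subordination, where \omega_1(z) is the unique
    % attracting fixed point of w \mapsto z + h_{\mu_2}(z + h_{\mu_1}(w))
    F_{\mu_1 \boxplus \mu_2}(z) = F_{\mu_1}(\omega_1(z)), \qquad
    \omega_1(z) = z + h_{\mu_2}\!\bigl(z + h_{\mu_1}(\omega_1(z))\bigr).
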
Award ID(s):
2002826
NSF-PAR ID:
10432681
Author(s) / Creator(s):
Date Published:
Journal Name:
Alea
Volume:
18
Issue:
2
ISSN:
1980-0436
Page Range / eLocation ID:
1585–1623
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study three convolutions of polynomials in the context of free probability theory. We prove that these convolutions can be written as the expected characteristic polynomials of sums and products of unitarily invariant random matrices. The symmetric additive and multiplicative convolutions were introduced by Walsh and Szegő in different contexts and have been studied for a century. The asymmetric additive convolution, and the connection of all of them with random matrices, are new. By developing the analogy with free probability, we prove that these convolutions produce real-rooted polynomials and provide strong bounds on the locations of their roots.
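
     The random-matrix characterization quoted in this abstract suggests a direct numerical check: the symmetric additive convolution of the characteristic polynomials of Hermitian A and B is E[det(xI − (A + UBU*))] over a Haar-random unitary U. A minimal numpy Monte Carlo sketch of that statement (matrix sizes, trial count, and helper names such as haar_unitary are illustrative, not from the paper):

        import numpy as np

        def haar_unitary(d, rng):
            # QR of a complex Ginibre matrix, with the standard phase fix,
            # yields a Haar-distributed unitary.
            z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
            q, r = np.linalg.qr(z)
            ph = np.diagonal(r) / np.abs(np.diagonal(r))
            return q * ph

        def expected_charpoly(A, B, trials=20000, seed=0):
            # Average the coefficients of det(xI - (A + U B U*)) over Haar U;
            # by the quoted theorem this estimates the symmetric additive
            # convolution of the characteristic polynomials of A and B.
            rng = np.random.default_rng(seed)
            d = A.shape[0]
            coeffs = np.zeros(d + 1)
            for _ in range(trials):
                U = haar_unitary(d, rng)
                coeffs += np.real(np.poly(A + U @ B @ U.conj().T))
            return coeffs / trials

        # Example: two 3x3 Hermitian (here diagonal) matrices.
        print(expected_charpoly(np.diag([1.0, 2.0, 3.0]), np.diag([0.0, 0.0, 1.0])))
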
  2. We introduce a framework to study discrete-variable (DV) quantum systems based on qudits. It relies on the notions of a mean state (MS), a minimal stabilizer-projection state (MSPS), and a new convolution. Some interesting consequences are: the MS is the closest MSPS to a given state with respect to the relative entropy, and the MS is extremal with respect to the von Neumann entropy, demonstrating a “maximal entropy principle in DV systems.” We obtain a series of inequalities for quantum entropies and for Fisher information based on the convolution, giving a “second law of thermodynamics for quantum convolutions.” We show that the convolution of two stabilizer states is a stabilizer state. We establish a central limit theorem based on iterating the convolution of a zero-mean quantum state, and show that the iterates converge to its MS. The rate of convergence is characterized by the “magic gap,” which we define in terms of the support of the characteristic function of the state. We elaborate on two examples: the DV beam splitter and the DV amplifier.

  3. Flow-based generative models have recently become one of the most efficient approaches to generative modeling: they are constructed as a sequence of invertible transformations with tractable Jacobians. Glow first introduced a simple type of generative flow using an invertible 1×1 convolution. However, the 1×1 convolution has limited flexibility compared to standard convolutions. In this paper, we propose a novel invertible n×n convolution approach that overcomes the limitations of the invertible 1×1 convolution. In addition, our proposed network is not only tractable and invertible but also uses fewer parameters than standard convolutions. Experiments on the CIFAR-10, ImageNet, and CelebA-HQ datasets show that our invertible n×n convolution helps to improve the performance of generative models significantly.
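
     For reference, the invertible 1×1 convolution from Glow that this paper generalizes can be sketched in a few lines: a 1×1 convolution applies one weight matrix W to the channel vector at every pixel, is invertible whenever W is, and each pixel contributes log|det W| to the change-of-variables log-density. A minimal numpy sketch (shapes and names are illustrative; the paper's n×n construction is not reproduced here):

        import numpy as np

        class Inv1x1Conv:
            def __init__(self, channels, rng):
                # Initialize W as a random rotation, so it is invertible
                # and log|det W| starts at zero.
                w, _ = np.linalg.qr(rng.standard_normal((channels, channels)))
                self.W = w

            def forward(self, x):
                # x has shape (H, W, C); a 1x1 convolution is the same linear
                # map applied to the channel vector at every pixel.
                y = np.einsum('hwc,dc->hwd', x, self.W)
                # Every pixel contributes log|det W| to the log-likelihood.
                H, Wd, _ = x.shape
                logdet = H * Wd * np.linalg.slogdet(self.W)[1]
                return y, logdet

            def inverse(self, y):
                # Per-pixel multiplication by the inverse weight matrix.
                return np.einsum('hwd,dc->hwc', y, np.linalg.inv(self.W).T)

        rng = np.random.default_rng(0)
        layer = Inv1x1Conv(3, rng)
        x = rng.standard_normal((8, 8, 3))
        y, logdet = layer.forward(x)
        assert np.allclose(layer.inverse(y), x)   # exact invertibility
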
  4. In this paper, we present NESTA, a specialized neural engine that significantly accelerates the computation of convolution layers in a deep convolutional neural network while reducing the computational energy. NESTA reformats convolutions into 3 × 3 batches and uses a hierarchy of Hamming weight compressors to process each batch. Moreover, when processing a convolution across multiple channels, NESTA, rather than computing the precise result of the convolution per channel, quickly computes an approximation of its partial sum along with a residual value such that adding the residual to the approximate partial sum yields the exact output. Then, instead of immediately adding the residual, NESTA consumes it when processing the next batch in the Hamming weight compressors with available capacity. This mechanism shortens the critical path by avoiding the need to propagate carry signals during each round of computation and speeds up the convolution of each channel. In the last stage of computation, when the partial sum of the last channel has been computed, NESTA terminates by adding the residual bits to the approximate output to produce the correct result.
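
     The carry-deferral mechanism described above is easiest to see at the integer level: a carry-save accumulator keeps an approximate sum plus a residual of unpropagated carries, and performs only one full carry-propagating add at the very end. A schematic Python sketch of that idea (integer-level only; NESTA's gate-level compressor trees are not modeled):

        def carry_save_accumulate(values):
            # (sum_bits, carry_bits) together represent the running total:
            # total == sum_bits + carry_bits after every step, but no carry
            # is propagated until the end -- mirroring how NESTA defers the
            # residual to the next round of Hamming weight compression.
            sum_bits, carry_bits = 0, 0
            for v in values:
                s = sum_bits ^ carry_bits ^ v            # bitwise sum, carries ignored
                c = ((sum_bits & carry_bits)
                     | (sum_bits & v)
                     | (carry_bits & v)) << 1            # deferred carries (the residual)
                sum_bits, carry_bits = s, c
            return sum_bits + carry_bits                 # one final propagating add

        assert carry_save_accumulate([3, 5, 9, 14]) == 31
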
  5. Deep convolutional neural networks have revolutionized many machine learning and computer vision tasks; however, some key challenges limit their wider use, including improving the network's robustness to perturbations of the input image and the limited “field of view” of convolution operators. We introduce IMEXnet, which addresses these challenges by adapting semi-implicit methods for partial differential equations. Compared to similar explicit networks, such as residual networks, our network is more stable, which has recently been shown to reduce sensitivity to small changes in the input features and improve generalization. The addition of an implicit step connects all pixels in each channel of the image and therefore addresses the field-of-view problem while remaining comparable to standard convolutions in the number of parameters and computational complexity. We also present a new dataset for semantic segmentation and demonstrate the effectiveness of our architecture using the NYU Depth dataset.
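
     The semi-implicit (IMEX) pattern behind such a layer splits each update into an explicit nonlinear part and an implicit linear solve; because the implicit solve couples every pixel in a channel, a single step has a global field of view. A minimal numpy sketch under illustrative assumptions (periodic boundaries, a Laplacian coupling term solved by FFT, step size h; this shows the general numerical pattern, not the paper's exact architecture):

        import numpy as np

        def fft_conv2d(x, k):
            # Periodic 2-D convolution via the FFT; k is zero-padded to x's shape.
            K = np.zeros_like(x)
            K[:k.shape[0], :k.shape[1]] = k
            return np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(K)))

        def imex_step(x, kernel, h=0.1, alpha=1.0):
            # Explicit half: an ordinary convolution plus a pointwise nonlinearity.
            rhs = x + h * np.tanh(fft_conv2d(x, kernel))
            # Implicit half: solve (I + h * alpha * L) y = rhs, where L is a
            # Laplacian-type coupling; with periodic boundaries the solve is
            # diagonal in Fourier space, so one FFT round trip suffices yet
            # every pixel in the channel is coupled (global field of view).
            H, W = x.shape
            ky = 2 * np.pi * np.fft.fftfreq(H)
            kx = 2 * np.pi * np.fft.fftfreq(W)
            sym = ky[:, None] ** 2 + kx[None, :] ** 2    # Fourier symbol of L
            y_hat = np.fft.fft2(rhs) / (1.0 + h * alpha * sym)
            return np.real(np.fft.ifft2(y_hat))

        x = np.random.default_rng(0).standard_normal((32, 32))
        y = imex_step(x, np.ones((3, 3)) / 9.0)
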