We consider the problem of estimating a $$p$$-dimensional vector $$\beta$$ from $$n$$ observations $$Y=X\beta+W$$, where $$\beta_{j}\stackrel{\mathrm{i.i.d.}}{\sim}\pi$$ for a real-valued distribution $$\pi$$ with zero mean and unit variance, $$X_{ij}\stackrel{\mathrm{i.i.d.}}{\sim}\mathcal{N}(0,1)$$, and $$W_{i}\stackrel{\mathrm{i.i.d.}}{\sim}\mathcal{N}(0,\sigma^{2})$$. In the asymptotic regime where $$n/p\rightarrow\delta$$ and $$p/\sigma^{2}\rightarrow\mathsf{snr}$$ for two fixed constants $$\delta,\mathsf{snr}\in(0,\infty)$$ as $$p\rightarrow\infty$$, the limiting (normalized) minimum mean-squared error (MMSE) has been characterized in terms of a single-letter (additive Gaussian scalar) channel. In this paper, we show that if the MMSE function of the single-letter channel converges to a step function, then the limiting MMSE of estimating $$\beta$$ also converges to a step function, jumping from 1 to 0 at a critical threshold. Moreover, we establish that the limiting mean-squared error of the (MSE-optimal) approximate message passing algorithm likewise converges to a step function, but with a larger threshold, providing evidence for a computational-statistical gap between the two thresholds.
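To make the single-letter characterization concrete, the sketch below numerically evaluates the scalar-channel MMSE for one particular prior, the Rademacher distribution $$\pi=\mathrm{Unif}\{\pm 1\}$$ (our choice for illustration; the abstract allows general $$\pi$$), for which the conditional mean is a $$\tanh$$ and $$\mathrm{mmse}(s)=1-\mathbb{E}[\tanh(s+\sqrt{s}\,Z)^{2}]$$ with $$Z\sim\mathcal{N}(0,1)$$. The function name and quadrature parameters are ours.

```python
import numpy as np

def scalar_mmse(s, z_max=10.0, n_grid=20001):
    """MMSE of estimating beta = +/-1 from sqrt(s)*beta + Z, Z ~ N(0, 1).

    Uses the closed form mmse(s) = 1 - E[tanh(s + sqrt(s) Z)^2] for the
    Rademacher prior, evaluated by quadrature on a uniform grid.
    """
    z, dz = np.linspace(-z_max, z_max, n_grid, retstep=True)
    phi = np.exp(-z**2 / 2.0) / np.sqrt(2.0 * np.pi)  # standard normal density
    integrand = np.tanh(s + np.sqrt(s) * z) ** 2 * phi
    return 1.0 - float(np.sum(integrand) * dz)
```

The function interpolates monotonically between 1 (no signal) and 0 (high signal-to-noise ratio); the paper's step-function hypothesis concerns priors for which this transition sharpens into a jump.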
Prevalence of deficiency-zero reaction networks in an Erdös–Rényi framework
Abstract Reaction networks are commonly used within the mathematical biology and mathematical chemistry communities to model the dynamics of interacting species. These models differ from the typical graphs found in random graph theory since their vertices are constructed from elementary building blocks, i.e. the species. We consider these networks in an Erdös–Rényi framework and, under suitable assumptions, derive a threshold function for the network to have a deficiency of zero, which is a property of great interest in the reaction network community. Specifically, if the number of species is denoted by n and the edge probability by $$p_n$$ , then we prove that the probability of a random binary network being deficiency zero converges to 1 if $$p_n\ll r(n)$$ as $$n \to \infty$$ , and converges to 0 if $$p_n \gg r(n)$$ as $$n \to \infty$$ , where $$r(n)=\frac{1}{n^3}$$ .
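For readers outside the reaction-network community, deficiency can be computed directly from its definition $$\delta = n_c - \ell - s$$: the number of complexes, minus the number of linkage classes, minus the rank of the stoichiometric matrix. The sketch below is our own illustration of that formula, not the paper's construction; the helper names and the dict encoding of complexes are assumptions.

```python
import numpy as np

def deficiency(species, reactions):
    """Deficiency of a reaction network: (#complexes) - (#linkage classes)
    - rank(stoichiometric matrix). `reactions` is a list of (source, product)
    pairs, where each complex is a dict mapping species -> coefficient."""
    complexes = []                      # distinct complexes, as sorted tuples

    def index(c):
        key = tuple(sorted(c.items()))
        if key not in complexes:
            complexes.append(key)
        return complexes.index(key)

    edges = [(index(src), index(dst)) for src, dst in reactions]
    n_c = len(complexes)

    # Linkage classes = connected components of the undirected complex graph.
    parent = list(range(n_c))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        parent[find(a)] = find(b)
    n_link = len({find(i) for i in range(n_c)})

    # Stoichiometric matrix: one column (product - source) per reaction.
    def as_vector(key):
        coeffs = dict(key)
        return [coeffs.get(sp, 0) for sp in species]

    S = np.array([np.subtract(as_vector(complexes[b]), as_vector(complexes[a]))
                  for a, b in edges]).T
    return n_c - n_link - int(np.linalg.matrix_rank(S))
```

For example, the reversible network A ⇌ B has 2 complexes, 1 linkage class, and stoichiometric rank 1, hence deficiency zero, while the Lotka-type reactions A → 2A, A + B → 2B, B → ∅ have deficiency one.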
- Award ID(s): 2051498
- PAR ID: 10344870
- Journal Name: Journal of Applied Probability
- Volume: 59
- Issue: 2
- ISSN: 0021-9002
- Page Range / eLocation ID: 384 to 398
- Sponsoring Org: National Science Foundation
More Like this
-
Motivated by both theory and practice, we study how random pruning of the weights affects a neural network's neural tangent kernel (NTK). In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version. The equivalence is established under two cases. The first main result studies the infinite-width asymptotic. It is shown that given a pruning probability, for fully-connected neural networks with the weights randomly pruned at the initialization, as the width of each layer grows to infinity sequentially, the NTK of the pruned neural network converges to the limiting NTK of the original network with some extra scaling. If the network weights are rescaled appropriately after pruning, this extra scaling can be removed. The second main result considers the finite-width case. It is shown that to ensure the NTK's closeness to the limit, the dependence of width on the sparsity parameter is asymptotically linear, as the NTK's gap to its limit goes down to zero. Moreover, if the pruning probability is set to zero (i.e., no pruning), the bound on the required width matches the bound for fully-connected neural networks in previous works up to logarithmic factors. The proof of this result requires developing a novel analysis of a network structure which we call mask-induced pseudo-networks. Experiments are provided to evaluate our results.
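The extra scaling mentioned above comes from a simple variance calculation: zeroing each weight independently with probability $$p$$ multiplies its second moment by $$1-p$$, and dividing the surviving weights by $$\sqrt{1-p}$$ restores it. The Monte Carlo sketch below checks only this bookkeeping, not the NTK computation itself; the pruning probability 0.4 and the sample size are our choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.4                                   # pruning probability (our choice)
w = rng.normal(size=2_000_000)            # i.i.d. N(0, 1) initial weights
mask = rng.random(w.shape) >= p           # keep each weight w.p. 1 - p
pruned = mask * w                         # pruning shrinks E[w^2] to 1 - p
rescaled = pruned / np.sqrt(1.0 - p)      # rescaling restores E[w^2] = 1
```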
-
In this work, we design a type of controller that consists of adding a specific set of reactions to an existing mass-action chemical reaction network in order to control a target species. This set of reactions is effective for both deterministic and stochastic networks, in the latter case controlling the mean as well as the variance of the target species. We employ a type of network property called absolute concentration robustness (ACR). We provide applications to the control of a multisite phosphorylation model as well as a receptor–ligand signalling system. For this framework, we use the so-called deficiency zero theorem from chemical reaction network theory as well as multiscaling model reduction methods. We show that the target species has approximately Poisson distribution with the desired mean. We further show that ACR controllers can bring robust perfect adaptation to a target species and are complementary to a recently introduced antithetic feedback controller used for stochastic chemical reactions.
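The "approximately Poisson with the desired mean" behavior can be seen already in the simplest stochastic birth–death system, $$\emptyset \to X$$ at rate $$k$$ and $$X \to \emptyset$$ at rate $$\gamma X$$, whose stationary law is exactly Poisson with mean $$k/\gamma$$. The Gillespie-style sketch below is a generic illustration of that limiting behavior, not the ACR controller from the paper; all parameter values and names are our choices.

```python
import numpy as np

def ssa_birth_death(k=20.0, gamma=1.0, t_end=2000.0, seed=1):
    """Gillespie simulation of 0 -> X (rate k), X -> 0 (rate gamma * X).

    The stationary distribution is Poisson(k / gamma); we report the
    time-averaged mean and variance of X after a burn-in period.
    """
    rng = np.random.default_rng(seed)
    t, x, burn_in = 0.0, 0, t_end / 4.0
    w_sum = w_sq = w_tot = 0.0
    while t < t_end:
        rate = k + gamma * x              # total event rate
        dt = rng.exponential(1.0 / rate)  # time to the next event
        if t > burn_in:                   # accumulate time-weighted moments
            w_sum += dt * x
            w_sq += dt * x * x
            w_tot += dt
        t += dt
        if rng.random() < k / rate:       # birth with probability k / rate
            x += 1
        else:                             # otherwise a degradation event
            x -= 1
    mean = w_sum / w_tot
    return mean, w_sq / w_tot - mean**2
```

With the defaults, the time-averaged mean and variance should both be close to $$k/\gamma = 20$$, consistent with a Poisson law (mean equals variance).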
-
We study the problem of robust multivariate polynomial regression: let p\colon\mathbb{R}^n\to\mathbb{R} be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (\mathbf{x}_i,y_i) \in [-1,1]^n \times \mathbb{R} that are noisy versions of (\mathbf{x}_i,p(\mathbf{x}_i)). More precisely, each \mathbf{x}_i is sampled independently from some distribution \chi on [-1,1]^n, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most \rho < 1/2, and otherwise satisfies |y_i-p(\mathbf{x}_i)|\leq\sigma. The goal is to output a polynomial \hat{p}, of degree at most d in each variable, within an \ell_\infty-distance of at most O(\sigma) from p. Kane, Karmalkar, and Price [FOCS'17] solved this problem for n=1. We generalize their results to the n-variate setting, showing an algorithm that achieves a sample complexity of O_n(d^n\log d), where the hidden constant depends on n, if \chi is the n-dimensional Chebyshev distribution. The sample complexity is O_n(d^{2n}\log d), if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most O(\sigma), and the run-time depends on \log(1/\sigma). In the setting where each \mathbf{x}_i and y_i are known up to N bits of precision, the run-time's dependence on N is linear. We also show that our sample complexities are optimal in terms of d^n. Furthermore, we show that it is possible to have the run-time be independent of 1/\sigma, at the cost of a higher sample complexity.
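To see why plain least squares is the wrong tool in this outlier model, the sketch below fits a degree-1 univariate polynomial under a \rho = 0.2 fraction of arbitrary responses using the Theil–Sen median-of-slopes estimator. This is a standard robust baseline, not the algorithm of the paper (which handles n-variate, degree-d polynomials with \ell_\infty guarantees); all names and parameter values are ours.

```python
import numpy as np

def theil_sen(x, y):
    """Robust line fit: slope = median of pairwise slopes, then a median
    intercept. Tolerates a constant fraction of arbitrary outliers."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    m = float(np.median(slopes))
    b = float(np.median(y - m * x))
    return m, b

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 200)
y = 3.0 * x + 1.0 + rng.normal(scale=0.05, size=200)    # inliers: sigma = 0.05
outliers = rng.random(200) < 0.2                        # rho = 0.2
y[outliers] = rng.uniform(-50.0, 50.0, outliers.sum())  # arbitrary responses
m_hat, b_hat = theil_sen(x, y)
```

Despite 20% of the responses being arbitrary, the recovered slope and intercept stay close to the true (3, 1), whereas an ordinary least-squares fit on the same data would be dragged far off by the outliers.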
-
Abstract An $$n \times n$$ matrix with $$\pm 1$$ entries that acts on $${\mathbb {R}}^{n}$$ as a scaled isometry is called Hadamard. Such matrices exist in some, but not all dimensions. Combining number-theoretic and probabilistic tools, we construct matrices with $$\pm 1$$ entries that act as approximate scaled isometries in $${\mathbb {R}}^{n}$$ for all $$n \in {\mathbb {N}}$$. More precisely, the matrices we construct have condition numbers bounded by a constant independent of $$n$$. Using this construction, we establish a phase transition for the probability that a random frame contains a Riesz basis. Namely, we show that a random frame in $${\mathbb {R}}^{n}$$ formed by $$N$$ vectors with independent identically distributed coordinates having a nondegenerate symmetric distribution contains many Riesz bases with high probability provided that $$N \ge \exp (Cn)$$. On the other hand, we prove that if the entries are sub-Gaussian, then a random frame fails to contain a Riesz basis with probability close to $$1$$ whenever $$N \le \exp (cn)$$, where $$c<C$$ are constants depending on the distribution of the entries.
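For dimensions that are powers of two, exact Hadamard matrices exist via the classical Sylvester doubling construction, and the condition number of such a matrix is exactly 1 since all its singular values equal $$\sqrt{n}$$; the paper's contribution is achieving constant condition numbers for all $$n$$. The sketch below (function name ours) checks only the exact power-of-two case.

```python
import numpy as np

def sylvester_hadamard(k):
    """2^k x 2^k matrix with +/-1 entries via Sylvester doubling
    H -> [[H, H], [H, -H]]. Satisfies H^T H = n I, so H / sqrt(n) is an
    isometry and the condition number of H is exactly 1."""
    H = np.array([[1.0]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H

H = sylvester_hadamard(4)   # n = 16
n = H.shape[0]
```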