Inference in high-dimensional linear regression via lattice basis reduction and integer relation detection
We consider the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector $\beta^*\in\mathbb{R}^p$ from its linear measurements, using a small number $n$ of samples. Unlike most of the literature, we make no sparsity assumption on $\beta^*$, but instead adopt a different regularization: in the noiseless setting, we assume the entries of $\beta^*$ are either rational numbers with a common denominator $Q\in\mathbb{Z}^+$ (referred to as $Q$-rationality), or irrational numbers taking values in a rationally independent set of bounded cardinality that is known to the learner; collectively, we call this the mixed-range assumption. Using a novel combination of the PSLQ integer relation detection algorithm and the Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm, we propose a polynomial-time algorithm which provably recovers a $\beta^*\in\mathbb{R}^p$ satisfying the mixed-range assumption from its linear measurements $Y=X\beta^*\in\mathbb{R}^n$, for a large class of distributions for the random entries of $X$, even with a single measurement ($n=1$). In the noisy setting, we propose a polynomial-time, lattice-based algorithm which recovers a $\beta^*\in\mathbb{R}^p$ satisfying the $Q$-rationality property from its noisy measurements $Y=X\beta^*+W\in\mathbb{R}^n$, again even from a single sample ($n=1$). We further establish that for large $Q$ and normal noise, this algorithm tolerates an information-theoretically optimal level of […]
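As a toy illustration of the noiseless $Q$-rational case, the sketch below recovers $\beta^*$ from a single measurement $Y=\langle X,\beta^*\rangle$ by running a textbook LLL reduction on the standard integer-relation lattice for the vector $(Y, X_1, \dots, X_p)$. This is a minimal sketch of the general idea, not the paper's algorithm: the instance is exact-rational, the entries of $X$ are hand-picked integers of the form $(P, P+1, P^2)$ so that they admit no small integer relation among themselves, the LLL implementation is the unoptimized textbook variant, and the scaling constant `N` is chosen ad hoc.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lll_reduce(B, delta=Fraction(3, 4)):
    """Textbook LLL lattice basis reduction over exact rationals."""
    B = [list(map(Fraction, b)) for b in B]

    def gram_schmidt():
        n = len(B)
        Bs, mu = [], [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = list(B[i])
            for j in range(i):
                mu[i][j] = dot(B[i], Bs[j]) / dot(Bs[j], Bs[j])
                v = [v[t] - mu[i][j] * Bs[j][t] for t in range(len(v))]
            Bs.append(v)
        return Bs, mu

    Bs, mu = gram_schmidt()
    k = 1
    while k < len(B):
        for j in range(k - 1, -1, -1):          # size reduction
            q = round(mu[k][j])
            if q:
                B[k] = [B[k][t] - q * B[j][t] for t in range(len(B[k]))]
                Bs, mu = gram_schmidt()
        if dot(Bs[k], Bs[k]) >= (delta - mu[k][k - 1] ** 2) * dot(Bs[k - 1], Bs[k - 1]):
            k += 1                               # Lovasz condition holds
        else:
            B[k], B[k - 1] = B[k - 1], B[k]      # swap and step back
            Bs, mu = gram_schmidt()
            k = max(k - 1, 1)
    return B

def integer_relation(v, N=10**8):
    """Find small integers m with sum(m_i * v_i) = 0 via LLL on the
    lattice spanned by rows (e_i | N * v_i) -- an exact-rational toy."""
    n = len(v)
    basis = []
    for i in range(n):
        row = [Fraction(0)] * n + [N * Fraction(v[i])]
        row[i] = Fraction(1)
        basis.append(row)
    reduced = lll_reduce(basis)
    best = min(reduced, key=lambda b: dot(b, b))  # shortest reduced vector
    return [int(c) for c in best[:n]]

# Toy instance: beta* = (1/3, 2/3, 5/3) is Q-rational with Q = 3, and
# X = (1009, 1010, 1009^2), so the single measurement is Y = 5093434/3.
X = [1009, 1010, 1018081]
Y = Fraction(5093434, 3)
m = integer_relation([Y] + X)        # expect +/-(3, -1, -2, -5)
if m[0] < 0:
    m = [-c for c in m]
beta_hat = [Fraction(-mi, m[0]) for mi in m[1:]]
print(beta_hat)                      # [Fraction(1, 3), Fraction(2, 3), Fraction(5, 3)]
```

The relation $(Q, -k_1, \dots, -k_p)$ with $\beta^*_i = k_i/Q$ is the uniquely shortest vector in this lattice once `N` is large enough, which is why a single measurement suffices in this regime.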
Award ID(s): 2022448
Publication Date:
NSF-PAR ID: 10343454
Journal Name: IEEE Transactions on Information Theory
ISSN: 1557-9654
Sponsoring Org: National Science Foundation
More Like this

Abstract We study the sparsity of the solutions to systems of linear Diophantine equations with and without non-negativity constraints. The sparsity of a solution vector is the number of its nonzero entries, which is referred to as the $\ell_0$-norm of the vector. Our main results are new improved bounds on the minimal $\ell_0$-norm of solutions to systems $A\varvec{x}=\varvec{b}$, where $A\in \mathbb{Z}^{m\times n}$, $\varvec{b}\in \mathbb{Z}^m$, and $\varvec{x}$ is either a general integer vector (lattice case) or a non-negative integer vector (semigroup case). In certain cases, we give polynomial time algorithms for computing solutions with $\ell_0$-norm satisfying the obtained bounds. We show that our bounds are tight. Our bounds can be seen as functions naturally generalizing the rank of a matrix over $\mathbb{R}$, to other subdomains such as $\mathbb{Z}$. We show that these new rank-like functions are all NP-hard to compute in general, but polynomial-time computable for a fixed number of variables.
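For intuition, minimal-$\ell_0$ solutions can be found by brute force on tiny instances. The sketch below is a naive bounded enumeration (unrelated to the abstract's bounds or algorithms): it searches all integer vectors with small entries and keeps the sparsest solution of $A\varvec{x}=\varvec{b}$, once over general integers (lattice case) and once under non-negativity (semigroup case), illustrating how the non-negativity constraint can change the answer.

```python
from itertools import product

def sparsest_solution(A, b, bound=3, nonneg=False):
    """Naive search for a minimal-l0 integer solution of A x = b.

    Enumerates all x with entries in [-bound, bound] (or [0, bound] in the
    semigroup case) -- exponential in the number of columns, so only a toy.
    Returns None if no solution exists within the bound.
    """
    n = len(A[0])
    vals = range(0, bound + 1) if nonneg else range(-bound, bound + 1)
    best = None
    for x in product(vals, repeat=n):
        if all(sum(a * xi for a, xi in zip(row, x)) == bi
               for row, bi in zip(A, b)):
            l0 = sum(xi != 0 for xi in x)
            if best is None or l0 < sum(xi != 0 for xi in best):
                best = x
    return best

# gcd(2, 3, 5) = 1, so 2x1 + 3x2 + 5x3 = 1 is solvable over the integers,
# but not with non-negative entries.
A, b = [[2, 3, 5]], [1]
print(sparsest_solution(A, b))               # lattice case: an l0 = 2 solution
print(sparsest_solution(A, b, nonneg=True))  # semigroup case: None (infeasible)
```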
We provide a computationally and statistically efficient estimator for the classical problem of truncated linear regression, where the dependent variable $y = w^T x + \varepsilon$ and its corresponding vector of covariates $x \in \mathbb{R}^k$ are only revealed if the dependent variable falls in some subset $S \subseteq \mathbb{R}$; otherwise the existence of the pair $(x, y)$ is hidden. This problem has remained a challenge since the early works of Tobin (1958); Amemiya (1973); Hausman and Wise (1977); Breen et al. (1996); its applications are abundant, and its history dates back even further to the work of Galton, Pearson, Lee, and Fisher: Galton (1897); Pearson and Lee (1908); Lee (1914); Fisher (1931). While consistent estimators of the regression coefficients have been identified, the error rates are not well understood, especially in high-dimensional settings. Under a "thickness assumption" about the covariance matrix of the covariates in the revealed sample, we provide a computationally efficient estimator for the coefficient vector $w$ from $n$ revealed samples that attains $\ell_2$ error $O(\sqrt{k/n})$, recovering the guarantees of least squares in the standard (untruncated) linear regression setting. Our estimator uses Projected Stochastic Gradient Descent (PSGD) on the negative log-likelihood of the truncated sample, and only needs oracle access to the set $S$, which may otherwise be arbitrary, and in particular may be non-convex. PSGD must be restricted to an appropriately defined convex cone to guarantee that the negative log-likelihood is strongly convex […]
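A minimal numerical illustration of the likelihood being optimized: with standard Gaussian noise and $S=[0,\infty)$, the per-sample negative log-likelihood of a revealed pair is $(y - w^\top x)^2/2 + \log \Phi(w^\top x)$ up to a constant, since $\Phi(w^\top x)$ is the probability the sample is revealed. The sketch below fits $w$ by plain full-batch gradient descent on this objective. All specifics (the half-line truncation set, sample size, step size, and the omission of the paper's cone projection) are simplifying assumptions for illustration, not the paper's PSGD estimator.

```python
import math
import random

def phi(t):                      # standard normal pdf
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def Phi(t):                      # standard normal cdf
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2)))

def fit_truncated(xs, ys, steps=500, lr=0.1):
    """Full-batch gradient descent on the truncated-Gaussian negative
    log-likelihood with S = [0, inf); a simplification of the paper's PSGD."""
    k = len(xs[0])
    w = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for x, y in zip(xs, ys):
            theta = sum(wi * xi for wi, xi in zip(w, x))
            hazard = phi(theta) / max(Phi(theta), 1e-300)  # guard underflow
            g = -(y - theta) + hazard    # d/dtheta of the per-sample NLL
            for i in range(k):
                grad[i] += g * x[i] / len(xs)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Synthetic truncated data: only pairs with y >= 0 are revealed.
random.seed(0)
w_true = [1.0, -0.5]
xs, ys = [], []
while len(xs) < 3000:
    x = [random.gauss(0, 1) for _ in w_true]
    y = sum(wi * xi for wi, xi in zip(w_true, x)) + random.gauss(0, 1)
    if y >= 0:
        xs.append(x)
        ys.append(y)
w_hat = fit_truncated(xs, ys)
print(w_hat)  # close to w_true, despite seeing only the truncated sample
```

Note that ordinary least squares on the revealed pairs alone would be biased here; correcting the gradient by the hazard term $\varphi(\theta)/\Phi(\theta)$ is what removes the truncation bias.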

Abstract We consider the problem of computing the partition function $\sum_x e^{f(x)}$, where $f: \{-1, 1\}^n \longrightarrow \mathbb{R}$ is a quadratic or cubic polynomial on the Boolean cube $\{-1, 1\}^n$. In the case of a quadratic polynomial $f$, we show that the partition function can be approximated within relative error $0 < \epsilon < 1$ in quasi-polynomial $n^{O(\ln n - \ln \epsilon)}$ time if the Lipschitz constant of the non-linear part of $f$ with respect to the $\ell^1$ metric on the Boolean cube does not exceed $1-\delta$, for any $\delta>0$ fixed in advance. For a cubic polynomial $f$, we get the same result under a somewhat stronger condition. We apply the method of polynomial interpolation, for which we prove that $\sum_x e^{\tilde{f}(x)} \ne 0$ for complex-valued polynomials $\tilde{f}$ in a neighborhood of a real-valued $f$ satisfying the above-mentioned conditions. The bounds are asymptotically optimal. Results on the zero-free region are interpreted as the absence of a phase transition in the Lee–Yang sense in the corresponding Ising model. The novel feature of the bounds is that they control the total interaction of each vertex but not […]
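For small $n$ the partition function can of course be evaluated exactly by summing over all $2^n$ points of the cube; the sketch below does this for a quadratic $f$ (the interpolation method in the abstract is what replaces this exponential enumeration at scale). The coupling matrix `J` and field `h` are illustrative names, not notation from the paper.

```python
import math
from itertools import product

def partition_function(J, h, const=0.0):
    """Exact Z = sum over x in {-1,1}^n of exp(f(x)), for the quadratic
    f(x) = const + sum_i h[i]*x_i + sum_{i<j} J[i][j]*x_i*x_j."""
    n = len(h)
    Z = 0.0
    for x in product((-1, 1), repeat=n):
        f = const + sum(hi * xi for hi, xi in zip(h, x))
        f += sum(J[i][j] * x[i] * x[j]
                 for i in range(n) for j in range(i + 1, n))
        Z += math.exp(f)
    return Z

# Two-spin check: f(x) = 0.1 * x1 * x2, so Z = 2*e^{0.1} + 2*e^{-0.1}.
J = [[0.0, 0.1], [0.0, 0.0]]
Z = partition_function(J, h=[0.0, 0.0])
print(Z)
```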

We study a noisy tensor completion problem of broad practical interest, namely, the reconstruction of a low-rank tensor from highly incomplete and randomly corrupted observations of its entries. Whereas a variety of prior work has been dedicated to this problem, prior algorithms either are computationally too expensive for large-scale applications or come with suboptimal statistical guarantees. Focusing on "incoherent" and well-conditioned tensors of a constant canonical polyadic rank, we propose a two-stage non-convex algorithm, (vanilla) gradient descent following a rough initialization, that achieves the best of both worlds. Specifically, the proposed non-convex algorithm faithfully completes the tensor and retrieves all individual tensor factors within nearly linear time, while at the same time enjoying near-optimal statistical guarantees (i.e., minimal sample complexity and optimal estimation accuracy). The estimation errors are evenly spread out across all entries, thus achieving optimal [Formula: see text] statistical accuracy. We also discuss how to extend our approach to accommodate asymmetric tensors. The insight conveyed through our analysis of non-convex optimization might have implications for other tensor estimation problems.
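The flavor of the problem can be seen on a rank-1 toy: observe a random subset of entries of $T_{ijk} = a_i b_j c_k$ and recover the factors. The sketch below replaces the paper's two-stage gradient descent with simple alternating least squares on the factors; the tensor size, the 80% observation rate, and the noiseless setting are all illustrative assumptions, not the paper's regime.

```python
import random

def als_rank1_complete(T, mask, sweeps=100):
    """Complete a rank-1 tensor (entries observed where mask is True) by
    alternating least squares on the factors of a_i * b_j * c_k."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    a, b, c = [1.0] * I, [1.0] * J, [1.0] * K
    obs = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)
           if mask[i][j][k]]
    for _ in range(sweeps):
        for i in range(I):            # exact least-squares update of a_i
            num = den = 0.0
            for (p, q, r) in obs:
                if p == i:
                    num += T[p][q][r] * b[q] * c[r]
                    den += (b[q] * c[r]) ** 2
            if den > 0:
                a[i] = num / den
        for j in range(J):            # update of b_j
            num = den = 0.0
            for (p, q, r) in obs:
                if q == j:
                    num += T[p][q][r] * a[p] * c[r]
                    den += (a[p] * c[r]) ** 2
            if den > 0:
                b[j] = num / den
        for k in range(K):            # update of c_k
            num = den = 0.0
            for (p, q, r) in obs:
                if r == k:
                    num += T[p][q][r] * a[p] * b[q]
                    den += (a[p] * b[q]) ** 2
            if den > 0:
                c[k] = num / den
    return a, b, c

# Rank-1 ground truth with ~80% of entries observed at random.
random.seed(1)
I = J = K = 6
at = [0.5 + random.random() for _ in range(I)]
bt = [0.5 + random.random() for _ in range(J)]
ct = [0.5 + random.random() for _ in range(K)]
T = [[[at[i] * bt[j] * ct[k] for k in range(K)]
      for j in range(J)] for i in range(I)]
mask = [[[random.random() < 0.8 for k in range(K)]
         for j in range(J)] for i in range(I)]
a, b, c = als_rank1_complete(T, mask)
err = max(abs(a[i] * b[j] * c[k] - T[i][j][k])
          for i in range(I) for j in range(J) for k in range(K))
print(err)  # reconstruction error on ALL entries, including unobserved ones
```

The recovered factors are only determined up to rescaling, so the quality check compares the reconstructed entries $a_i b_j c_k$, not the factors themselves.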