Inference in high-dimensional linear regression via lattice basis reduction and integer relation detection

Gamarnik, D.; Kizildag, E.C.; Ilias Zadik, I.

doi:10.1109/TIT.2021.3113921

We consider the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector $$\beta^*\in\mathbb{R}^p$$ from its linear measurements, using a small number $$n$$ of samples. Unlike most of the literature, we make no sparsity assumption on $$\beta^*$$, but instead adopt a different regularization: In the noiseless setting, we assume $$\beta^*$$ consists of entries, which are either rational numbers with a common denominator $$Q\in\mathbb{Z}^+$$ (referred to as $Q-$$rationality); or irrational numbers taking values in a rationally independent set of bounded cardinality, known to learner; collectively called as the mixed-range assumption. Using a novel combination of the Partial Sum of Least Squares (PSLQ) integer relation detection, and the Lenstra-Lenstra-Lov\'asz (LLL) lattice basis reduction algorithms, we propose a polynomial-time algorithm which provably recovers a $$\beta^*\in\mathbb{R}^p$ enjoying the mixed-range assumption, from its linear measurements $$Y=X\beta^*\in\mathbb{R}^n$$ for a large class of distributions for the random entries of $$X$$, even with one measurement ($n=1$). In the noisy setting, we propose a polynomial-time, lattice-based algorithm, which recovers a $$\beta^*\in\mathbb{R}^p$$ enjoying the $Q-$rationality property, from its noisy measurements $$Y=X\beta^*+W\in\mathbb{R}^n$$, even from a single sample ($n=1$). We further establish that for large $$Q$$, and normal noise, this algorithm tolerates information-theoretically optimal level of noise. We then apply these ideas to develop a polynomial-time, single-sample algorithm for the phase retrieval problem. Our methods address the single-sample ($n=1$) regime, where the sparsity-based methods such as the Least Absolute Shrinkage and Selection Operator (LASSO) and the Basis Pursuit are known to fail. Furthermore, our results also reveal algorithmic connections between the high-dimensional linear regression problem, and the integer relation detection, randomized subset-sum, and shortest vector problems.

More Like this