skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Sparse representation of vectors in lattices and semigroups
We study the sparsity of the solutions to systems of linear Diophantine equations with and without non-negativity constraints. The sparsity of a solution vector is the number of its nonzero entries, which is referred to as the ℓ0 ℓ 0 -norm of the vector. Our main results are new improved bounds on the minimal ℓ0 ℓ 0 -norm of solutions to systems 𝐴𝑥𝑥=𝑏𝑏 A x x = b b , where 𝐴∈ℤ𝑚×𝑛 A ∈ Z m × n , 𝑏𝑏∈ℤ𝑚 b b ∈ Z m and 𝑥𝑥 x x is either a general integer vector (lattice case) or a non-negative integer vector (semigroup case). In certain cases, we give polynomial time algorithms for computing solutions with ℓ0 ℓ 0 -norm satisfying the obtained bounds. We show that our bounds are tight. Our bounds can be seen as functions naturally generalizing the rank of a matrix over ℝ R , to other subdomains such as ℤ Z . We show that these new rank-like functions are all NP-hard to compute in general, but polynomial-time computable for fixed number of variables.  more » « less
Award ID(s):
1818969
PAR ID:
10341704
Author(s) / Creator(s):
Date Published:
Journal Name:
Mathematical programming
Volume:
192
ISSN:
0025-5610
Page Range / eLocation ID:
519–546
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract We study the sparsity of the solutions to systems of linear Diophantine equations with and without non-negativity constraints. The sparsity of a solution vector is the number of its nonzero entries, which is referred to as the $$\ell _0$$ ℓ 0 -norm of the vector. Our main results are new improved bounds on the minimal $$\ell _0$$ ℓ 0 -norm of solutions to systems $$A\varvec{x}=\varvec{b}$$ A x = b , where $$A\in \mathbb {Z}^{m\times n}$$ A ∈ Z m × n , $${\varvec{b}}\in \mathbb {Z}^m$$ b ∈ Z m and $$\varvec{x}$$ x is either a general integer vector (lattice case) or a non-negative integer vector (semigroup case). In certain cases, we give polynomial time algorithms for computing solutions with $$\ell _0$$ ℓ 0 -norm satisfying the obtained bounds. We show that our bounds are tight. Our bounds can be seen as functions naturally generalizing the rank of a matrix over $$\mathbb {R}$$ R , to other subdomains such as $$\mathbb {Z}$$ Z . We show that these new rank-like functions are all NP-hard to compute in general, but polynomial-time computable for fixed number of variables. 
    more » « less
  2. Bienstock, D. (Ed.)
    Motivated by problems in optimization we study the sparsity of the solutions to systems of linear Diophantine equations and linear integer programs, i.e., the number of non-zero entries of a solution, which is often referred to as the ℓ0-norm. Our main results are improved bounds on the ℓ0-norm of sparse solutions to systems 𝐴𝑥=𝑏, where 𝐴∈ℤ𝑚×𝑛, 𝑏∈ℤ𝑚 and 𝑥 is either a general integer vector (lattice case) or a non-negative integer vector (semigroup case). In the lattice case and certain scenarios of the semigroup case, we give polynomial time algorithms for computing solutions with ℓ0-norm satisfying the obtained bounds. 
    more » « less
  3. Abstract In this paper we prove a result on the effective generation of pluri-canonical linear systems on foliated surfaces of general type. Fix a function {P:\mathbb{Z}_{\geq 0}\to\mathbb{Z}} , then there exists an integer {N>0} such that if {(X,{\mathcal{F}})} is a canonical or nef model of a foliation of general type with Hilbert polynomial {\chi(X,{\mathcal{O}}_{X}(mK_{\mathcal{F}}))=P(m)} for all {m\in\mathbb{Z}_{\geq 0}} , then {|mK_{\mathcal{F}}|} defines a birational map for all {m\geq N} . On the way, we also prove a Grauert–Riemenschneider-type vanishing theorem for foliated surfaces with canonical singularities. 
    more » « less
  4. Krylov subspace methods are a ubiquitous tool for computing near-optimal rank kk approximations of large matrices. While "large block" Krylov methods with block size at least kk give the best known theoretical guarantees, block size one (a single vector) or a small constant is often preferred in practice. Despite their popularity, we lack theoretical bounds on the performance of such "small block" Krylov methods for low-rank approximation. We address this gap between theory and practice by proving that small block Krylov methods essentially match all known low-rank approximation guarantees for large block methods. Via a black-box reduction we show, for example, that the standard single vector Krylov method run for t iterations obtains the same spectral norm and Frobenius norm error bounds as a Krylov method with block size ℓ≥kℓ≥k run for O(t/ℓ)O(t/ℓ) iterations, up to a logarithmic dependence on the smallest gap between sequential singular values. That is, for a given number of matrix-vector products, single vector methods are essentially as effective as any choice of large block size. By combining our result with tail-bounds on eigenvalue gaps in random matrices, we prove that the dependence on the smallest singular value gap can be eliminated if the input matrix is perturbed by a small random matrix. Further, we show that single vector methods match the more complex algorithm of [Bakshi et al. `22], which combines the results of multiple block sizes to achieve an improved algorithm for Schatten pp-norm low-rank approximation. 
    more » « less
  5. null (Ed.)
    We consider the problem of finding a two-layer neural network with sigmoid, rectified linear unit (ReLU), or binary step activation functions that "fits" a training data set as accurately as possible as quantified by the training error; and study the following question: \emph{does a low training error guarantee that the norm of the output layer (outer norm) itself is small?} We answer affirmatively this question for the case of non-negative output weights. Using a simple covering number argument, we establish that under quite mild distributional assumptions on the input/label pairs; any such network achieving a small training error on polynomially many data necessarily has a well-controlled outer norm. Notably, our results (a) have a polynomial (in d) sample complexity, (b) are independent of the number of hidden units (which can potentially be very high), (c) are oblivious to the training algorithm; and (d) require quite mild assumptions on the data (in particular the input vector X∈ℝd need not have independent coordinates). We then leverage our bounds to establish generalization guarantees for such networks through \emph{fat-shattering dimension}, a scale-sensitive measure of the complexity class that the network architectures we investigate belong to. Notably, our generalization bounds also have good sample complexity (polynomials in d with a low degree), and are in fact near-linear for some important cases of interest. 
    more » « less