skip to main content


Title: Polynomial-Time Power-Sum Decomposition of Polynomials
We give efficient algorithms for finding power-sum decomposition of an input polynomial with component s. The case of linear s is equivalent to the well-studied tensor decomposition problem while the quadratic case occurs naturally in studying identifiability of non-spherical Gaussian mixtures from low-order moments. Unlike tensor decomposition, both the unique identifiability and algorithms for this problem are not well-understood. For the simplest setting of quadratic s and , prior work of [GHK15] yields an algorithm only when . On the other hand, the more general recent result of [GKS20] builds an algebraic approach to handle any components but only when is large enough (while yielding no bounds for or even ) and only handles an inverse exponential noise. Our results obtain a substantial quantitative improvement on both the prior works above even in the base case of and quadratic s. Specifically, our algorithm succeeds in decomposing a sum of generic quadratic s for and more generally the th power-sum of generic degree- polynomials for any . Our algorithm relies only on basic numerical linear algebraic primitives, is exact (i.e., obtain arbitrarily tiny error up to numerical precision), and handles an inverse polynomial noise when the s have random Gaussian coefficients.  more » « less
Award ID(s):
2047933
NSF-PAR ID:
10407348
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)
Page Range / eLocation ID:
956 to 967
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    We give new and efficient black-box reconstruction algorithms for some classes of depth-3 arithmetic circuits. As a consequence, we obtain the first efficient algorithm for computing the tensor rank and for finding the optimal tensor decomposition as a sum of rank-one tensors when then input is a constant-rank tensor. More specifically, we provide efficient learning algorithms that run in randomized polynomial time over general fields and in deterministic polynomial time over and for the following classes: 1) Set-multilinear depth-3 circuits of constant top fan-in ((k) circuits). As a consequence of our algorithm, we obtain the first polynomial time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank tensors. This result holds for d dimensional tensors for any d, but is interesting even for d=3. 2) Sums of powers of constantly many linear forms ((k) circuits). As a consequence we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank symmetric tensors. 3) Multilinear depth-3 circuits of constant top fan-in (multilinear (k) circuits). Our algorithm works over all fields of characteristic 0 or large enough characteristic. Prior to our work the only efficient algorithms known were over polynomially-sized finite fields (see. Karnin-Shpilka 09’). Prior to our work, the only polynomial-time or even subexponential-time algorithms known (deterministic or randomized) for subclasses of (k) circuits that also work over large/infinite fields were for the setting when the top fan-in k is at most 2 (see Sinha 16’ and Sinha 20’). 
    more » « less
  2. Abstract

    Let$M_{\langle \mathbf {u},\mathbf {v},\mathbf {w}\rangle }\in \mathbb C^{\mathbf {u}\mathbf {v}}{\mathord { \otimes } } \mathbb C^{\mathbf {v}\mathbf {w}}{\mathord { \otimes } } \mathbb C^{\mathbf {w}\mathbf {u}}$denote the matrix multiplication tensor (and write$M_{\langle \mathbf {n} \rangle }=M_{\langle \mathbf {n},\mathbf {n},\mathbf {n}\rangle }$), and let$\operatorname {det}_3\in (\mathbb C^9)^{{\mathord { \otimes } } 3}$denote the determinant polynomial considered as a tensor. For a tensorT, let$\underline {\mathbf {R}}(T)$denote its border rank. We (i) give the first hand-checkable algebraic proof that$\underline {\mathbf {R}}(M_{\langle 2\rangle })=7$, (ii) prove$\underline {\mathbf {R}}(M_{\langle 223\rangle })=10$and$\underline {\mathbf {R}}(M_{\langle 233\rangle })=14$, where previously the only nontrivial matrix multiplication tensor whose border rank had been determined was$M_{\langle 2\rangle }$, (iii) prove$\underline {\mathbf {R}}( M_{\langle 3\rangle })\geq 17$, (iv) prove$\underline {\mathbf {R}}(\operatorname {det}_3)=17$, improving the previous lower bound of$12$, (v) prove$\underline {\mathbf {R}}(M_{\langle 2\mathbf {n}\mathbf {n}\rangle })\geq \mathbf {n}^2+1.32\mathbf {n}$for all$\mathbf {n}\geq 25$, where previously only$\underline {\mathbf {R}}(M_{\langle 2\mathbf {n}\mathbf {n}\rangle })\geq \mathbf {n}^2+1$was known, as well as lower bounds for$4\leq \mathbf {n}\leq 25$, and (vi) prove$\underline {\mathbf {R}}(M_{\langle 3\mathbf {n}\mathbf {n}\rangle })\geq \mathbf {n}^2+1.6\mathbf {n}$for all$\mathbf {n} \ge 18$, where previously only$\underline {\mathbf {R}}(M_{\langle 3\mathbf {n}\mathbf {n}\rangle })\geq \mathbf {n}^2+2$was known. The last two results are significant for two reasons: (i) they are essentially the first nontrivial lower bounds for tensors in an “unbalanced” ambient space and (ii) they demonstrate that the methods we use (border apolarity) may be applied to sequences of tensors.

    The methods used to obtain the results are new and “nonnatural” in the sense of Razborov and Rudich, in that the results are obtained via an algorithm that cannot be effectively applied to generic tensors. We utilize a new technique, calledborder apolaritydeveloped by Buczyńska and Buczyński in the general context of toric varieties. We apply this technique to develop an algorithm that, given a tensorTand an integerr, in a finite number of steps, either outputs that there is no border rankrdecomposition forTor produces a list of all normalized ideals which could potentially result from a border rank decomposition. The algorithm is effectively implementable whenThas a large symmetry group, in which case it outputs potential decompositions in a natural normal form. The algorithm is based on algebraic geometry and representation theory.

     
    more » « less
  3. null (Ed.)
    We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than 1/2 fraction of examples. For any our algorithm takes as input a sample of n linear equations where (1- \alpha) n of the equations are {\em arbitrarily} chosen. It outputs a list that contains a linear function that is close to the truth. Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution D. In particular, this gives an efficient algorithm to find an optimal size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in \emph{regular} directions, we give an algorithm that achieves similar guarantee under the promise that the true linear function has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the `identifiability to algorithms' paradigm based on the sum-of-squares method. 
    more » « less
  4. Embedding properties of network realizations of dissipative reduced order models Jörn Zimmerling, Mikhail Zaslavsky,Rob Remis, Shasri Moskow, Alexander Mamonov, Murthy Guddati, Vladimir Druskin, and Liliana Borcea Mathematical Sciences Department, Worcester Polytechnic Institute https://www.wpi.edu/people/vdruskin Abstract Realizations of reduced order models of passive SISO or MIMO LTI problems can be transformed to tridiagonal and block-tridiagonal forms, respectively, via dierent modications of the Lanczos algorithm. Generally, such realizations can be interpreted as ladder resistor-capacitor-inductor (RCL) networks. They gave rise to network syntheses in the rst half of the 20th century that was at the base of modern electronics design and consecutively to MOR that tremendously impacted many areas of engineering (electrical, mechanical, aerospace, etc.) by enabling ecient compression of the underlining dynamical systems. In his seminal 1950s works Krein realized that in addition to their compressing properties, network realizations can be used to embed the data back into the state space of the underlying continuum problems. In more recent works of the authors Krein's ideas gave rise to so-called nite-dierence Gaussian quadrature rules (FDGQR), allowing to approximately map the ROM state-space representation to its full order continuum counterpart on a judicially chosen grid. Thus, the state variables can be accessed directly from the transfer function without solving the full problem and even explicit knowledge of the PDE coecients in the interior, i.e., the FDGQR directly learns" the problem from its transfer function. This embedding property found applications in PDE solvers, inverse problems and unsupervised machine learning. Here we show a generalization of this approach to dissipative PDE problems, e.g., electromagnetic and acoustic wave propagation in lossy dispersive media. Potential applications include solution of inverse scattering problems in dispersive media, such as seismic exploration, radars and sonars. To x the idea, we consider a passive irreducible SISO ROM fn(s) = Xn j=1 yi s + σj , (62) assuming that all complex terms in (62) come in conjugate pairs. We will seek ladder realization of (62) as rjuj + vj − vj−1 = −shˆjuj , uj+1 − uj + ˆrj vj = −shj vj , (63) for j = 0, . . . , n with boundary conditions un+1 = 0, v1 = −1, and 4n real parameters hi, hˆi, ri and rˆi, i = 1, . . . , n, that can be considered, respectively, as the equivalent discrete inductances, capacitors and also primary and dual conductors. Alternatively, they can be viewed as respectively masses, spring stiness, primary and dual dampers of a mechanical string. Reordering variables would bring (63) into tridiagonal form, so from the spectral measure given by (62 ) the coecients of (63) can be obtained via a non-symmetric Lanczos algorithm written in J-symmetric form and fn(s) can be equivalently computed as fn(s) = u1. The cases considered in the original FDGQR correspond to either (i) real y, θ or (ii) real y and imaginary θ. Both cases are covered by the Stieltjes theorem, that yields in case (i) real positive h, hˆ and trivial r, rˆ, and in case (ii) real positive h,r and trivial hˆ,rˆ. This result allowed us a simple interpretation of (62) as the staggered nite-dierence approximation of the underlying PDE problem [2]. For PDEs in more than one variables (including topologically rich data-manifolds), a nite-dierence interpretation is obtained via a MIMO extensions in block form, e.g., [4, 3]. The main diculty of extending this approach to general passive problems is that the Stieltjes theory is no longer applicable. Moreover, the tridiagonal realization of a passive ROM transfer function (62) via the ladder network (63) cannot always be obtained in port-Hamiltonian form, i.e., the equivalent primary and dual conductors may change sign [1]. 100 Embedding of the Stieltjes problems, e.g., the case (i) was done by mapping h and hˆ into values of acoustic (or electromagnetic) impedance at grid cells, that required a special coordinate stretching (known as travel time coordinate transform) for continuous problems. Likewise, to circumvent possible non-positivity of conductors for the non-Stieltjes case, we introduce an additional complex s-dependent coordinate stretching, vanishing as s → ∞ [1]. This stretching applied in the discrete setting induces a diagonal factorization, removes oscillating coecients, and leads to an accurate embedding for moderate variations of the coecients of the continuum problems, i.e., it maps discrete coecients onto the values of their continuum counterparts. Not only does this embedding yields an approximate linear algebraic algorithm for the solution of the inverse problems for dissipative PDEs, it also leads to new insight into the properties of their ROM realizations. We will also discuss another approach to embedding, based on Krein-Nudelman theory [5], that results in special data-driven adaptive grids. References [1] Borcea, Liliana and Druskin, Vladimir and Zimmerling, Jörn, A reduced order model approach to inverse scattering in lossy layered media, Journal of Scientic Computing, V. 89, N1, pp. 136,2021 [2] Druskin, Vladimir and Knizhnerman, Leonid, Gaussian spectral rules for the three-point second dierences: I. A two-point positive denite problem in a semi-innite domain, SIAM Journal on Numerical Analysis, V. 37, N 2, pp.403422, 1999 [3] Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Distance preserving model order reduction of graph-Laplacians and cluster analysis, Druskin, Vladimir and Mamonov, Alexander V and Zaslavsky, Mikhail, Journal of Scientic Computing, V. 90, N 1, pp 130, 2022 [4] Druskin, Vladimir and Moskow, Shari and Zaslavsky, Mikhail LippmannSchwingerLanczos algorithm for inverse scattering problems, Inverse Problems, V. 37, N. 7, 2021, [5] Mark Adolfovich Nudelman The Krein String and Characteristic Functions of Maximal Dissipative Operators, Journal of Mathematical Sciences, 2004, V 124, pp 49184934 Go back to Plenary Speakers Go back to Speakers Go back 
    more » « less
  5. Abstract Linear structural equation models relate the components of a random vector using linear interdependencies and Gaussian noise. Each such model can be naturally associated with a mixed graph whose vertices correspond to the components of the random vector. The graph contains directed edges that represent the linear relationships between components, and bidirected edges that encode unobserved confounding. We study the problem of generic identifiability, that is, whether a generic choice of linear and confounding effects can be uniquely recovered from the joint covariance matrix of the observed random vector. An existing combinatorial criterion for establishing generic identifiability is the half-trek criterion (HTC), which uses the existence of trek systems in the mixed graph to iteratively discover generically invertible linear equation systems in polynomial time. By focusing on edges one at a time, we establish new sufficient and new necessary conditions for generic identifiability of edge effects extending those of the HTC. In particular, we show how edge coefficients can be recovered as quotients of subdeterminants of the covariance matrix, which constitutes a determinantal generalization of formulas obtained when using instrumental variables for identification. While our results do not completely close the gap between existing sufficient and necessary conditions we find, empirically, that our results allow us to prove the generic identifiability of many more mixed graphs than the prior state-of-the-art. 
    more » « less