 Award ID(s):
 1908774
 Publication Date:
 NSFPAR ID:
 10299301
 Journal Name:
 Proeedings of the International Workshop on Artificial Intelligence and Statistics
 Volume:
 130
 Page Range or eLocationID:
 27282736
 ISSN:
 1525531X
 Sponsoring Org:
 National Science Foundation
More Like this

We investigate the approximability of the following optimization problem. The input is an n× n matrix A=(Aij) with real entries and an originsymmetric convex body K⊂ ℝn that is given by a membership oracle. The task is to compute (or approximate) the maximum of the quadratic form ∑i=1n∑j=1n Aij xixj=⟨ x,Ax⟩ as x ranges over K. This is a rich and expressive family of optimization problems; for different choices of matrices A and convex bodies K it includes a diverse range of optimization problems like maxcut, Grothendieck/noncommutative Grothendieck inequalities, small set expansion and more. While the literature studied these special cases using casespecific reasoning, here we develop a general methodology for treatment of the approximability and inapproximability aspects of these questions. The underlying geometry of K plays a critical role; we show under commonly used complexity assumptions that polytime constantapproximability necessitates that K has type2 constant that grows slowly with n. However, we show that even when the type2 constant is bounded, this problem sometimes exhibits strong hardness of approximation. Thus, even within the realm of type2 bodies, the approximability landscape is nuanced and subtle. However, the link that we establish between optimization and geometry of Banach spaces allows usmore »

Abstract We consider the problem of covering multiple submodular constraints. Given a finite ground set
N , a weight function ,$$w: N \rightarrow \mathbb {R}_+$$ $w:N\to {R}_{+}$r monotone submodular functions over$$f_1,f_2,\ldots ,f_r$$ ${f}_{1},{f}_{2},\dots ,{f}_{r}$N and requirements the goal is to find a minimum weight subset$$k_1,k_2,\ldots ,k_r$$ ${k}_{1},{k}_{2},\dots ,{k}_{r}$ such that$$S \subseteq N$$ $S\subseteq N$ for$$f_i(S) \ge k_i$$ ${f}_{i}\left(S\right)\ge {k}_{i}$ . We refer to this problem as$$1 \le i \le r$$ $1\le i\le r$MultiSubmodCover and it was recently considered by HarPeled and Jones (Few cuts meet many point sets. CoRR.arxiv:abs1808.03260 HarPeled and Jones 2018) who were motivated by an application in geometry. Even with$$r=1$$ $r=1$MultiSubmodCover generalizes the wellknown Submodular Set Cover problem (SubmodSC ), and it can also be easily reduced toSubmodSC . A simple greedy algorithm gives an approximation where$$O(\log (kr))$$ $O(log(kr\left)\right)$ and this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm for$$k = \sum _i k_i$$ $k={\sum}_{i}{k}_{i}$MultiSubmodCover that covers each constraint to within a factor of while incurring an approximation of$$(11/e\varepsilon )$$ $(11/e\epsilon )$ in the cost. Second, we consider the special case when each$$O(\frac{1}{\epsilon }\log r)$$ $O(\frac{1}{\u03f5}logr)$ is a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover ($$f_i$$ ${f}_{i}$PartialSC ), covering integer programs (CIPs ) and multiple vertex cover constraintsmore » 
Nonnegative matrix factorization (NMF), the problem of finding two nonnegative lowrank factors whose product approximates an input matrix, is a useful tool for many data mining and scientific applications such as topic modeling in text mining and unmixing in microscopy. In this paper, we focus on scaling algorithms for NMF to very large sparse datasets and massively parallel machines by employing effective algorithms, communication patterns, and partitioning schemes that leverage the sparsity of the input matrix. We consider two previous works developed for related problems,one that uses a finegrained partitioning strategy using a pointtopoint communication pattern and one that uses a Cartesian, or checkerboard, partitioning strategy using a collectivebased communication pattern. We show that a combination of the previous approaches balances the demands of the various computations within NMF algorithms and achieves high efficiency and scalability. From the experiments, we see that our proposed strategy runs up to 10x faster than the state of the art on realworld datasets.

Abstract In a Merlin–Arthur proof system, the proof verifier (Arthur) accepts valid proofs (from Merlin) with probability 1, and rejects invalid proofs with probability arbitrarily close to 1. The running time of such a system is defined to be the length of Merlin’s proof plus the running time of Arthur. We provide new Merlin–Arthur proof systems for some key problems in finegrained complexity. In several cases our proof systems have optimal running time. Our main results include:
Certifying that a list of
n integers has no 3SUM solution can be done in Merlin–Arthur time . Previously, Carmosino et al. [ITCS 2016] showed that the problem has a nondeterministic algorithm running in$$\tilde{O}(n)$$ $\stackrel{~}{O}\left(n\right)$ time (that is, there is a proof system with proofs of length$$\tilde{O}(n^{1.5})$$ $\stackrel{~}{O}\left({n}^{1.5}\right)$ and a deterministic verifier running in$$\tilde{O}(n^{1.5})$$ $\stackrel{~}{O}\left({n}^{1.5}\right)$ time).$$\tilde{O}(n^{1.5})$$ $\stackrel{~}{O}\left({n}^{1.5}\right)$Counting the number of
k cliques with total edge weight equal to zero in ann node graph can be done in Merlin–Arthur time (where$${\tilde{O}}(n^{\lceil k/2\rceil })$$ $\stackrel{~}{O}\left({n}^{\lceil k/2\rceil}\right)$ ). For odd$$k\ge 3$$ $k\ge 3$k , this bound can be further improved for sparse graphs: for example, counting the number of zeroweight triangles in anm edge graph can be done in Merlin–Arthur time . Previous Merlin–Arthur protocols by Williams [CCC’16] and Björklund and Kaski [PODC’16] could only count$${\tilde{O}}(m)$$ $\stackrel{~}{O}\left(m\right)$k cliques in unweighted graphs, and had worse running times for smallk .Computing the AllPairsmore »
Certifying that an
n variablek CNF is unsatisfiable can be done in Merlin–Arthur time . We also observe an algebrization barrier for the previous$$2^{n/2  n/O(k)}$$ ${2}^{n/2n/O\left(k\right)}$ time Merlin–Arthur protocol of R. Williams [CCC’16] for$$2^{n/2}\cdot \textrm{poly}(n)$$ ${2}^{n/2}\xb7\text{poly}\left(n\right)$ SAT: in particular, his protocol algebrizes, and we observe there is no algebrizing protocol for$$\#$$ $\#$k UNSAT running in time. Therefore we have to exploit nonalgebrizing properties to obtain our new protocol.$$2^{n/2}/n^{\omega (1)}$$ ${2}^{n/2}/{n}^{\omega \left(1\right)}$ Due to the centrality of these problems in finegrained complexity, our results have consequences for many other problems of interest. For example, our work implies that certifying there is no Subset Sum solution toCertifying a Quantified Boolean Formula is true can be done in Merlin–Arthur time
. Previously, the only nontrivial result known along these lines was an Arthur–Merlin–Arthur protocol (where Merlin’s proof depends on some of Arthur’s coins) running in$$2^{4n/5}\cdot \textrm{poly}(n)$$ ${2}^{4n/5}\xb7\text{poly}\left(n\right)$ time.$$2^{2n/3}\cdot \textrm{poly}(n)$$ ${2}^{2n/3}\xb7\text{poly}\left(n\right)$n integers can be done in Merlin–Arthur time , improving on the previous best protocol by Nederlof [IPL 2017] which took$$2^{n/3}\cdot \textrm{poly}(n)$$ ${2}^{n/3}\xb7\text{poly}\left(n\right)$ time.$$2^{0.49991n}\cdot \textrm{poly}(n)$$ ${2}^{0.49991n}\xb7\text{poly}\left(n\right)$ 
Nonnegative Matrix Factorization (NMF) is broadly used to determine class membership in a variety of clustering applications. From movie recommendations and image clustering to visual feature extractions, NMF has applications to solve a large number of knowledge discovery and data mining problems. Traditional optimization methods, such as the Multiplicative Updating Algorithm (MUA), solves the NMF problem by utilizing an auxiliary function to ensure that the objective monotonically decreases. Although the objective in MUA converges, there exists no proof to show that the learned matrix factors converge as well. Without this rigorous analysis, the clustering performance and stability of the NMF algorithms cannot be guaranteed. To address this knowledge gap, in this article, we study the factorbounded NMF problem and provide a solution algorithm with proven convergence by rigorous mathematical analysis, which ensures that both the objective and matrix factors converge. In addition, we show the relationship between MUA and our solution followed by an analysis of the convergence of MUA. Experiments on both toy data and realworld datasets validate the correctness of our proposed method and its utility as an effective clustering algorithm.