This content will become publicly available on January 16, 2025
- NSF-PAR ID: 10539920
- Publisher / Repository: International Conference of Learning Representations
- Date Published:
- ISBN: 9781713872740
- Format(s): Medium: X
- Location: International Conference of Learning Representations
- Sponsoring Org: National Science Foundation
More Like this
- Differential privacy is a mathematical framework for developing statistical computations with provable guarantees of privacy and accuracy. In contrast to the privacy component of differential privacy, which has a clear mathematical and intuitive meaning, the accuracy component does not have a generally accepted definition; accuracy claims of differential privacy algorithms vary from algorithm to algorithm and are not instantiations of a general definition. We identify program discontinuity as a common theme in existing ad hoc definitions and introduce an alternative notion of accuracy parametrized by what we call the distance to disagreement: the distance to disagreement of an input x w.r.t. a deterministic computation f and a distance d is the minimal distance d(x, y) over all y such that f(y) ≠ f(x). We show that our notion of accuracy subsumes the definition used in theoretical computer science and captures known accuracy claims for differential privacy algorithms. In fact, our general notion of accuracy helps us prove better claims in some cases. Next, we study the decidability of accuracy. We first show that accuracy is in general undecidable. Then, we define a non-trivial class of probabilistic computations for which accuracy is decidable (unconditionally, or assuming Schanuel's conjecture). We implement our decision procedure and experimentally evaluate the effectiveness of our approach for generating proofs or counterexamples of accuracy for common algorithms from the literature.
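For intuition, here is a brute-force sketch of the distance-to-disagreement computation over a small finite domain; the threshold function, the 3-bit domain, and the Hamming metric below are hypothetical illustrations, not from the paper, and a realistic input space would need a smarter search than full enumeration.

```python
from itertools import product

def distance_to_disagreement(f, x, domain, d):
    """Minimal d(x, y) over all y in the domain with f(y) != f(x).
    Brute-force reference for small finite domains."""
    fx = f(x)
    dists = [d(x, y) for y in domain if f(y) != fx]
    return min(dists) if dists else float("inf")  # inf: f is constant

# Hypothetical example: majority-style threshold on 3-bit vectors,
# with Hamming distance as the metric d.
f = lambda y: sum(y) >= 2
d = lambda x, y: sum(a != b for a, b in zip(x, y))
domain = list(product([0, 1], repeat=3))

print(distance_to_disagreement(f, (1, 1, 1), domain, d))  # 2: must flip two bits
```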
- We consider using gradient descent to minimize the nonconvex function $f(X)=\phi(XX^{T})$ over an $n\times r$ factor matrix $X$, in which $\phi$ is an underlying smooth convex cost function defined over $n\times n$ matrices. While only a second-order stationary point $X$ can be provably found in reasonable time, if $X$ is additionally \emph{rank deficient}, then its rank deficiency certifies it as being globally optimal. This way of certifying global optimality necessarily requires the search rank $r$ of the current iterate $X$ to be \emph{overparameterized} with respect to the rank $r^{\star}$ of the global minimizer $X^{\star}$. Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with $r=r^{\star}$ to a sublinear rate when $r>r^{\star}$, even when $\phi$ is strongly convex. In this paper, we propose an inexpensive preconditioner that restores the convergence rate of gradient descent back to linear in the overparameterized case, while also making it agnostic to possible ill-conditioning in the global minimizer $X^{\star}$.
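As a toy illustration of why such a preconditioner can help, the NumPy sketch below runs right-preconditioned gradient descent on an overparameterized rank-3 factorization of a rank-1 matrix recovery problem. The specific damping schedule ($\eta$ tied to the square root of the current loss) and the step size are illustrative assumptions, not the paper's prescribed choices.

```python
import numpy as np

# Toy instance: phi(M) = 0.5 * ||M - M_star||_F^2 with a rank-1 target,
# minimized over an overparameterized n x r factor X (r = 3 > r* = 1).
rng = np.random.default_rng(0)
u = rng.standard_normal((5, 1))
M_star = u @ u.T

loss = lambda X: 0.5 * np.linalg.norm(X @ X.T - M_star) ** 2
grad = lambda X: 2 * (X @ X.T - M_star) @ X  # gradient of loss w.r.t. X

X = 0.1 * rng.standard_normal((5, 3))
alpha = 0.2  # illustrative step size
for _ in range(500):
    # Right-preconditioner (X^T X + eta I)^{-1}: rescales the flat
    # directions that overparameterization introduces. The damping eta
    # shrinks with the loss, so the preconditioner sharpens near optimality.
    eta = np.sqrt(loss(X))
    P = np.linalg.inv(X.T @ X + eta * np.eye(X.shape[1]))
    X = X - alpha * grad(X) @ P

print(loss(X))  # should be near zero despite r > r*
```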
- This work considers the minimization of a general convex function f(X) over the cone of positive semi-definite matrices whose optimal solution X* is of low rank. Standard first-order convex solvers require performing an eigenvalue decomposition in each iteration, severely limiting their scalability. A natural nonconvex reformulation of the problem factors the variable X into the product of a rectangular matrix with fewer columns and its transpose. For a special class of matrix sensing and completion problems with quadratic objective functions, local search algorithms applied to the factored problem have been shown to be much more efficient and, in spite of being nonconvex, to converge to the global optimum. The purpose of this work is to extend this line of study to general convex objective functions f(X) and investigate the geometry of the resulting factored formulations. Specifically, we prove that when f(X) satisfies the restricted well-conditioned assumption, each critical point of the factored problem either corresponds to the optimal solution X* or is a strict saddle where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation ensures that many local search algorithms can converge to the global optimum with random initializations.
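In symbols, the factored reformulation and the critical-point dichotomy described above can be summarized as follows (a paraphrase of the abstract's claim, with $g$ denoting the factored objective):

```latex
% Burer--Monteiro-style factorization of the convex program:
\min_{X \succeq 0} f(X)
\quad\longrightarrow\quad
\min_{U \in \mathbb{R}^{n \times r}} g(U) := f\!\left(UU^{T}\right)

% Dichotomy for critical points of g under restricted well-conditioning:
\nabla g(U) = 0 \;\Longrightarrow\;
UU^{T} = X^{*}
\quad\text{or}\quad
\lambda_{\min}\!\bigl(\nabla^{2} g(U)\bigr) < 0
```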
- We study the problem of certification: given queries to a function f : {0,1}^n → {0,1} with certificate complexity ≤ k and an input x⋆, output a size-k certificate for f's value on x⋆. For monotone functions, a classic local search algorithm of Angluin accomplishes this task with n queries, which we show is optimal for local search algorithms. Our main result is a new algorithm for certifying monotone functions with O(k^8 log n) queries, which comes close to matching the information-theoretic lower bound of Ω(k log n). The design and analysis of our algorithm are based on a new connection to threshold phenomena in monotone functions. We further prove exponential-in-k lower bounds when f is non-monotone, and when f is monotone but the algorithm is only given random examples of f. These lower bounds show that assumptions on the structure of f and query access to it are both necessary for the polynomial dependence on k that we achieve.
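For context (this is the classic n-query baseline, not the paper's new O(k^8 log n) algorithm), here is a sketch of Angluin-style local search: greedily flip bits of x⋆ toward the opposite value while f's output is preserved, and the bits that cannot be flipped form a certificate by monotonicity. The example function below is hypothetical.

```python
def certify_monotone(f, x):
    """Angluin-style local search for a monotone f: {0,1}^n -> {0,1}.
    Returns coordinate indices certifying f's value on x, using one
    query per candidate coordinate (at most n queries total)."""
    x = list(x)
    target = f(x)
    flip_from = 1 if target else 0  # lower 1-bits if f(x)=1, raise 0-bits if f(x)=0
    for i in range(len(x)):
        if x[i] == flip_from:
            x[i] = 1 - flip_from
            if f(x) != target:  # flip changed the value: this bit is essential
                x[i] = flip_from
    # By monotonicity, the surviving essential bits certify f's value.
    return [i for i, b in enumerate(x) if b == flip_from]

# Hypothetical monotone example: f(x) = (x0 AND x1) OR x3.
f = lambda x: (x[0] and x[1]) or x[3]
print(certify_monotone(f, [1, 1, 0, 0]))  # [0, 1]: x0 = x1 = 1 forces f = 1
```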
- Consider the linear transport equation in 1D under an external confining potential $\Phi$: $\partial_t f + v\,\partial_x f - \partial_x \Phi\,\partial_v f = 0$. For $\Phi = \frac{x^2}{2} + \frac{\varepsilon x^4}{2}$ (with $\varepsilon > 0$ small), we prove phase mixing and quantitative decay estimates for $\partial_t \varphi := -\Delta^{-1} \int_{\mathbb{R}} \partial_t f \, \mathrm{d}v$, with an inverse polynomial decay rate $O(\langle t \rangle^{-2})$. In the proof, we develop a commuting vector field approach, suitably adapted to this setting. We will explain why we hope this is relevant for the nonlinear stability of the zero solution for the Vlasov–Poisson system in 1D under the external potential $\Phi$.
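As standard background for the statement above (not part of the paper's estimates): f is simply transported along the Hamiltonian characteristics of the confining potential, so the stated decay of $\partial_t \varphi$ must come from phase mixing, i.e. averaging in v, rather than from any dissipative mechanism:

```latex
% f is constant along the characteristics of H(x, v) = v^2/2 + \Phi(x):
\frac{\mathrm{d}}{\mathrm{d}t}\, f\bigl(t, X(t), V(t)\bigr) = 0,
\qquad \dot{X} = V, \quad \dot{V} = -\partial_x \Phi\bigl(X(t)\bigr)
```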