Abstract When k and s are natural numbers and $${\mathbf h}\in {\mathbb Z}^k$$, denote by $$J_{s,k}(X;\,{\mathbf h})$$ the number of integral solutions of the system $$ \sum_{i=1}^s(x_i^j-y_i^j)=h_j\quad (1\leqslant j\leqslant k), $$ with $$1\leqslant x_i,y_i\leqslant X$$. When $$s<k(k+1)/2$$ and $$(h_1,\ldots ,h_{k-1})\ne {\mathbf 0}$$, Brandes and Hughes have shown that $$J_{s,k}(X;\,{\mathbf h})=o(X^s)$$. In this paper we improve on quantitative aspects of this result, and, subject to an extension of the main conjecture in Vinogradov’s mean value theorem, we obtain an asymptotic formula for $$J_{s,k}(X;\,{\mathbf h})$$ in the critical case $$s=k(k+1)/2$$. The latter requires minor arc estimates going beyond square-root cancellation.
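For concreteness, the counting function can be evaluated directly from its definition for very small parameters. The brute-force sketch below (plain Python, function name ours) enumerates all $$X^{2s}$$ tuples, so it is usable only for tiny $$s$$ and $$X$$. Note that when $${\mathbf h}={\mathbf 0}$$ the diagonal solutions, in which $$(y_1,\ldots ,y_s)$$ is a permutation of $$(x_1,\ldots ,x_s)$$, already contribute at least $$X^s$$, which is why a bound of the shape $$o(X^s)$$ is only possible in the inhomogeneous case.

```python
from itertools import product

def J(s, k, X, h):
    """Count solutions of sum_{i<=s} (x_i^j - y_i^j) = h_j for j = 1..k,
    with 1 <= x_i, y_i <= X, by exhaustive enumeration: O(X^(2s)) work."""
    assert len(h) == k
    rng = range(1, X + 1)
    return sum(
        1
        for xs in product(rng, repeat=s)
        for ys in product(rng, repeat=s)
        if all(sum(x ** j - y ** j for x, y in zip(xs, ys)) == h[j - 1]
               for j in range(1, k + 1))
    )

# h = (0, 0) recovers the classical Vinogradov mean value J_{s,k}(X);
# a nonzero shift h (the inhomogeneous case) typically admits far fewer solutions.
print(J(2, 2, 5, [0, 0]), J(2, 2, 5, [1, 0]))
```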
Uncoupled isotonic regression via minimum Wasserstein deconvolution
Abstract Isotonic regression is a standard problem in shape-constrained estimation where the goal is to estimate an unknown non-decreasing regression function $$f$$ from independent pairs $$(x_i, y_i)$$ where $${\mathbb{E}}[y_i]=f(x_i)$$, $$i=1,\ldots ,n$$. While this problem is well understood both statistically and computationally, much less is known about its uncoupled counterpart, where one is given only the unordered sets $$\{x_1, \ldots , x_n\}$$ and $$\{y_1, \ldots , y_n\}$$. In this work, we leverage tools from optimal transport theory to derive minimax rates under weak moment conditions on $$y_i$$ and to give an efficient algorithm achieving optimal rates. Both upper and lower bounds employ moment-matching arguments that are also pertinent to learning mixtures of distributions and deconvolution.
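As a point of reference (a naive baseline, not the paper's estimator), in one dimension the optimal-transport coupling between the two unordered samples is simply to sort both: since $$f$$ is non-decreasing, the smallest $$x$$ should be matched with the smallest $$y$$, and so on. The sketch below (numpy, names ours) recovers $$f$$ exactly from the unordered sets in the noiseless case; the moment-matching and Wasserstein-deconvolution machinery of the paper is what is needed once the $$y_i$$ carry noise, because then the sorted $$y$$'s only reveal the distribution of $$f(x)$$ convolved with the noise.

```python
import numpy as np

def uncoupled_isotonic_baseline(xs, ys):
    """Naive baseline for uncoupled isotonic regression:
    the 1-D optimal-transport coupling of the two empirical measures
    pairs sorted x's with sorted y's.  Exact when the y's are noiseless."""
    return np.sort(np.asarray(xs)), np.sort(np.asarray(ys))

# noiseless sanity check: shuffled samples of a monotone f
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = rng.permutation(x ** 3)          # f(t) = t^3, coupling destroyed
xs, fs = uncoupled_isotonic_baseline(x, y)
assert np.allclose(fs, xs ** 3)      # sorting re-couples the samples
```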
- PAR ID: 10128161
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Information and Inference: A Journal of the IMA
- Volume: 8
- Issue: 4
- ISSN: 2049-8772
- Page Range / eLocation ID: p. 691-717
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Chan, Timothy; Fischer, Johannes; Iacono, John; Herman, Grzegorz (Eds.) We study the problem of robust multivariate polynomial regression: let p: ℝⁿ → ℝ be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (𝐱_i,y_i) ∈ [-1,1]ⁿ × ℝ that are noisy versions of (𝐱_i,p(𝐱_i)). More precisely, each 𝐱_i is sampled independently from some distribution χ on [-1,1]ⁿ, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most ρ < 1/2, and otherwise satisfies |y_i-p(𝐱_i)| ≤ σ. The goal is to output a polynomial p̂, of degree at most d in each variable, within an 𝓁_∞-distance of at most O(σ) from p. Kane, Karmalkar, and Price [FOCS'17] solved this problem for n = 1. We generalize their results to the n-variate setting, showing an algorithm that achieves a sample complexity of O_n(dⁿlog d), where the hidden constant depends on n, if χ is the n-dimensional Chebyshev distribution. The sample complexity is O_n(d^{2n}log d), if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most O(σ), and the run-time depends on log(1/σ). In the setting where each 𝐱_i and y_i are known up to N bits of precision, the run-time’s dependence on N is linear. We also show that our sample complexities are optimal in terms of dⁿ. Furthermore, we show that it is possible to have the run-time be independent of 1/σ, at the cost of a higher sample complexity.
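A classical way to tolerate a ρ < 1/2 fraction of arbitrary outliers is least-absolute-deviations (𝓁_1) regression, which can be written as a linear program. The sketch below (numpy and scipy assumed; the function name is ours) fits a univariate polynomial in the Chebyshev basis this way. It is meant only to make the contamination model concrete, not to reproduce the algorithm of the paper, whose guarantees rest on a more careful 𝓁_∞ analysis.

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from scipy.optimize import linprog

def l1_poly_fit(x, y, d):
    """Degree-d least-absolute-deviations polynomial fit, an
    outlier-tolerant baseline.  Solve  min_c sum_i t_i  subject to
    -t_i <= y_i - (V c)_i <= t_i,  V the Chebyshev Vandermonde matrix."""
    n = len(x)
    V = C.chebvander(x, d)                      # shape (n, d+1)
    # variables: [c_0..c_d, t_1..t_n]; minimize the sum of the slacks t
    cost = np.concatenate([np.zeros(d + 1), np.ones(n)])
    A_ub = np.block([[ V, -np.eye(n)],          #  Vc - t <= y
                     [-V, -np.eye(n)]])         # -Vc - t <= -y
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * (d + 1) + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return C.Chebyshev(res.x[:d + 1])

# sanity check: quadratic + bounded noise + 20% gross outliers
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 400)
y = 2 * x**2 - x + rng.uniform(-0.05, 0.05, 400)   # sigma = 0.05
out = rng.random(400) < 0.2
y[out] += rng.normal(0, 10, out.sum())             # arbitrary corruptions
p_hat = l1_poly_fit(x, y, 2)
print(np.max(np.abs(p_hat(x) - (2 * x**2 - x))))   # should be on the order of sigma
```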
- In the (special) smoothing spline problem one considers a variational problem with a quadratic data fidelity penalty and Laplacian regularization. Higher-order regularity can be obtained by replacing the Laplacian regularizer with a poly-Laplacian regularizer. The methodology is readily adapted to graphs, and here we consider graph poly-Laplacian regularization in a fully supervised, non-parametric, noise-corrupted regression problem. In particular, given a dataset $$\{x_i\}_{i=1}^n$$ and a set of noisy labels $$\{y_i\}_{i=1}^n\subset \mathbb{R}$$, we let $$u_n:\{x_i\}_{i=1}^n\rightarrow \mathbb{R}$$ be the minimizer of an energy which consists of a data fidelity term and an appropriately scaled graph poly-Laplacian term. When $$y_i = g(x_i)+\xi_i$$, for iid noise $$\xi_i$$, and using the geometric random graph, we identify (with high probability) the rate of convergence of $$u_n$$ to $$g$$ in the large data limit $$n\rightarrow \infty$$. Furthermore, our rate is close to the known rate of convergence in the usual smoothing spline model.
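A minimal numpy sketch of this setup (our notation; the ε-graph construction, penalty scale τ, and order m below are illustrative choices, not the tuned ones from the paper): build a geometric graph on the data, form the unnormalized Laplacian $$L = D - W$$, and take $$u_n$$ to be the minimizer of $$\tfrac{1}{n}\Vert u-y\Vert_2^2+\tau\,u^{\top}L^m u$$, which solves the linear system $$(I + n\tau L^m)u = y$$.

```python
import numpy as np

def poly_laplacian_regression(x, y, eps, m, tau):
    """Graph poly-Laplacian regression: minimize
    (1/n)*||u - y||^2 + tau * u^T L^m u  over u in R^n,
    whose first-order condition is (I + n*tau*L^m) u = y."""
    n = len(x)
    dist = np.abs(x[:, None] - x[None, :])        # pairwise distances (1-D data)
    W = (dist < eps).astype(float) - np.eye(n)    # epsilon-graph, no self-loops
    L = np.diag(W.sum(axis=1)) - W                # unnormalized graph Laplacian
    Lm = np.linalg.matrix_power(L, m)             # poly-Laplacian L^m
    return np.linalg.solve(np.eye(n) + n * tau * Lm, y)

# denoise g(t) = sin(2*pi*t) observed with iid Gaussian noise
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 300))
g = np.sin(2 * np.pi * x)
y = g + rng.normal(0, 0.3, 300)
u = poly_laplacian_regression(x, y, eps=0.05, m=2, tau=3e-4)
print(np.mean((y - g) ** 2), np.mean((u - g) ** 2))  # smoothing should shrink the error
```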
- We first introduce and study the notion of multi-weighted blow-ups, which is later used to systematically construct an explicit yet efficient algorithm for functorial logarithmic resolution in characteristic zero, in the sense of Hironaka. Specifically, for a singular, reduced closed subscheme $$X$$ of a smooth scheme $$Y$$ over a field of characteristic zero, we resolve the singularities of $$X$$ by taking proper transforms $$X_i \subset Y_i$$ along a sequence of multi-weighted blow-ups $$Y_N \to Y_{N-1} \to \dotsb \to Y_0 = Y$$ which satisfies the following properties: (i) the $$Y_i$$ are smooth Artin stacks with simple normal crossing exceptional loci; (ii) at each step we always blow up the worst singular locus of $$X_i$$, and witness on $$X_{i+1}$$ an immediate improvement in singularities; (iii) and finally, the singular locus of $$X$$ is transformed into a simple normal crossing divisor on $$X_N$$.