Title: An Optimal Reduction of TV-Denoising to Adaptive Online Learning
We consider the problem of estimating a function from n noisy samples, where the discrete Total Variation (TV) of the function is bounded by C_n. We reveal a deep connection to the seemingly disparate problem of Strongly Adaptive online learning (Daniely et al., 2015) and provide an O(n log n) time algorithm that attains the near minimax optimal rate of ~O(n^(1/3) C_n^(2/3)) under squared error loss. The resulting algorithm runs online and optimally adapts to the unknown smoothness parameter C_n. This leads to a new and more versatile alternative to wavelet-based methods for (1) adaptively estimating TV bounded functions; (2) online forecasting of TV bounded trends in time series.
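To make the quantities in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes the discrete TV of a sequence is the sum of absolute first differences (the usual convention, not spelled out in this record), and the function names `discrete_tv` and `rate_bound` are hypothetical. The rate is evaluated without the constants and logarithmic factors hidden by the ~O notation.

```python
import numpy as np


def discrete_tv(theta):
    """Discrete total variation, assumed here to be sum_i |theta[i+1] - theta[i]|."""
    theta = np.asarray(theta, dtype=float)
    return float(np.abs(np.diff(theta)).sum())


def rate_bound(n, c_n):
    """Evaluate n^(1/3) * C_n^(2/3), ignoring constants and log factors."""
    return n ** (1 / 3) * c_n ** (2 / 3)


# A piecewise-constant sequence with jumps of size 1 and 2.
theta = [0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
c_n = discrete_tv(theta)
print(c_n)  # 3.0
print(rate_bound(len(theta), c_n))
```

For a fixed C_n the bound grows like n^(1/3), so under this reading the per-sample squared error shrinks like n^(-2/3).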
Award ID(s):
2029626
PAR ID:
10232802
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the problem of efficiently estimating the effect of an intervention on a single variable using observational samples. Our goal is to give algorithms with polynomial time and sample complexity in a non-parametric setting. Tian and Pearl (AAAI ’02) have exactly characterized the class of causal graphs for which causal effects of atomic interventions can be identified from observational data. We make their result quantitative. Suppose 𝒫 is a causal model on a set V of n observable variables with respect to a given causal graph G, and let do(x) be an identifiable intervention on a variable X. We show that, assuming G has bounded in-degree and bounded c-components (k) and the observational distribution satisfies a strong positivity condition, the following hold: (i) [Evaluation] There is an algorithm that outputs with probability 2/3 an evaluator for a distribution P^ that satisfies TV(P(V | do(x)), P^(V)) < eps using m = O(n/eps^2) samples from P and O(mn) time. The evaluator can return in O(n) time the probability P^(v) for any assignment v to V. (ii) [Sampling] There is an algorithm that outputs with probability 2/3 a sampler for a distribution P^ that satisfies TV(P(V | do(x)), P^(V)) < eps using m = O(n/eps^2) samples from P and O(mn) time. The sampler returns an iid sample from P^ with probability 1 in O(n) time. We extend our techniques to estimate P(Y | do(x)) for a subset Y of variables of interest. We also show lower bounds for the sample complexity, demonstrating that our sample complexity has optimal dependence on the parameters n and eps and, if k=1, also on the strong positivity parameter.
  2. We consider the framework of non-stationary stochastic optimization (Besbes et al., 2015) with squared error losses and noisy gradient feedback, where the dynamic regret of an online learner against a time-varying comparator sequence is studied. Motivated by the theory of non-parametric regression, we introduce a new variational constraint that enforces the comparator sequence to belong to a discrete k^{th} order Total Variation ball of radius C_n. This variational constraint models comparators that have piece-wise polynomial structure, which has many relevant practical applications (Tibshirani, 2014). By establishing connections to the theory of wavelet-based non-parametric regression, we design a polynomial time algorithm that achieves the nearly optimal dynamic regret of ~O(n^{1/(2k+3)} C_n^{2/(2k+3)}). The proposed policy is adaptive to the unknown radius C_n. Further, we show that the same policy is minimax optimal for several other non-parametric families of interest.
  3. Ruiz, Francisco and (Ed.)
    We consider the problem of universal dynamic regret minimization under exp-concave and smooth losses. We show that appropriately designed Strongly Adaptive algorithms achieve a dynamic regret of $$\tilde O(d^2 n^{1/5} [\mathcal{TV}_1(w_{1:n})]^{2/5} \vee d^2)$$, where $$n$$ is the time horizon and $$\mathcal{TV}_1(w_{1:n})$$ is a path variational based on second order differences of the comparator sequence. Such a path variational naturally encodes comparator sequences that are piece-wise linear, a powerful family that tracks a variety of non-stationarity patterns in practice (Kim et al., 2009). The aforementioned dynamic regret is shown to be optimal modulo dimension dependencies and poly-logarithmic factors of $$n$$. To the best of our knowledge, this path variational has not been studied in the non-stochastic online learning literature before. Our proof techniques rely on analysing the KKT conditions of the offline oracle and require several non-trivial generalizations of the ideas in Baby and Wang (2021), where the latter work only implies an $$\tilde{O}(n^{1/3})$$ regret for the current problem.
  4. We present the first near-linear-time algorithm that computes a (1+ε)-approximation of the diameter of a weighted unit-disk graph of n vertices. Our algorithm requires O(n log^2 n) time for any constant ε>0, so we considerably improve upon the near-O(n^{3/2})-time algorithm of Gao and Zhang (2005). Using similar ideas we develop (1+ε)-approximate distance oracles of O(1) query time with a likewise improvement in the preprocessing time, specifically from near O(n^{3/2}) to O(n log^3 n). We also obtain similar new results for a number of related problems in the weighted unit-disk graph metric such as the radius and the bichromatic closest pair. As a further application we employ our distance oracle, along with additional ideas, to solve the (1+ε)-approximate all-pairs bounded-leg shortest paths (apBLSP) problem for a set of n planar points. Our data structure requires O(n^2 log n) space, O(log log n) query time, and nearly O(n^{2.579}) preprocessing time for any constant ε>0, and is the first that breaks the near-cubic preprocessing time bound given by Roditty and Segal (2011).
  5. Given a set $$P$$ of $$n$$ points in the plane, we consider the problem of computing the number of points of $$P$$ in a query unit disk (i.e., all query disks have the same radius). We show that the main techniques for simplex range searching in the plane can be adapted to this problem. For example, by adapting Matoušek's results, we can build a data structure of $$O(n)$$ space in $$O(n^{1+\delta})$$ time (for any $$\delta>0$$) so that each query can be answered in $$O(\sqrt{n})$$ time; alternatively, we can build a data structure of $$O(n^2/\log^2 n)$$ space with $$O(n^{1+\delta})$$ preprocessing time (for any $$\delta>0$$) and $$O(\log n)$$ query time. Our techniques lead to improvements for several other classical problems in computational geometry. 1. Given a set of $$n$$ unit disks and a set of $$n$$ points in the plane, the batched unit-disk range counting problem is to compute for each disk the number of points in it. Previous work [Katz and Sharir, 1997] solved the problem in $$O(n^{4/3}\log n)$$ time. We give a new algorithm of $$O(n^{4/3})$$ time, which is optimal as it matches an $$\Omega(n^{4/3})$$-time lower bound. For small $$\chi$$, where $$\chi$$ is the number of pairs of unit disks that intersect, we further improve the algorithm to $$O(n^{2/3}\chi^{1/3}+n^{1+\delta})$$ time, for any $$\delta>0$$. 2. The above result immediately leads to an $$O(n^{4/3})$$ time optimal algorithm for counting the intersecting pairs of circles for a set of $$n$$ unit circles in the plane. The previous best algorithms solve the problem in $$O(n^{4/3}\log n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n^{4/3}\log^{2/3} n)$$ expected time by a randomized algorithm [Agarwal, Pellegrini, and Sharir, 1993]. 3. Given a set $$P$$ of $$n$$ points in the plane and an integer $$k$$, the distance selection problem is to find the $$k$$-th smallest distance among all pairwise distances of $$P$$. The problem can be solved in $$O(n^{4/3}\log^2 n)$$ deterministic time [Katz and Sharir, 1997] or in $$O(n\log n+n^{2/3}k^{1/3}\log^{5/3}n)$$ expected time by a randomized algorithm [Chan, 2001]. Our new randomized algorithm runs in $$O(n\log n +n^{2/3}k^{1/3}\log n)$$ expected time. 4. Given a set $$P$$ of $$n$$ points in the plane, the discrete $$2$$-center problem is to compute two smallest congruent disks whose centers are in $$P$$ and whose union covers $$P$$. An $$O(n^{4/3}\log^5 n)$$-time algorithm was known [Agarwal, Sharir, and Welzl, 1998]. Our techniques yield a deterministic algorithm of $$O(n^{4/3}\log^{10/3} n\cdot (\log\log n)^{O(1)})$$ time and a randomized algorithm of $$O(n^{4/3}\log^3 n\cdot (\log\log n)^{1/3})$$ expected time.