skip to main content


Title: Adaptive Online Estimation of Piecewise Polynomial Trends
We consider the framework of non-stationary stochastic optimization (Besbes et al., 2015) with squared error losses and noisy gradient feedback where the dynamic regret of an online learner against a time varying comparator sequence is studied. Motivated from the theory of non-parametric regression, we introduce a new variational constraint that enforces the comparator sequence to belong to a discrete k^{th} order Total Variation ball of radius C_n. This variational constraint models comparators that have piece-wise polynomial structure which has many relevant practical applications (Tibshirani, 2014). By establishing connections to the theory of wavelet based non-parametric regression, we design a polynomial time algorithm that achieves the nearly optimal dynamic regret of ~O(n^{1/(2k+3)} C_n^{2/(2k+3)}). The proposed policy is adaptive to the unknown radius C_n. Further, we show that the same policy is minimax optimal for several other non-parametric families of interest.  more » « less
Award ID(s):
2029626
NSF-PAR ID:
10232804
Author(s) / Creator(s):
;
Date Published:
Journal Name:
NeurIPS 2020
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Ruiz, Francisco and (Ed.)
    We consider the problem of universal dynamic regret minimization under exp-concave and smooth losses. We show that appropriately designed Strongly Adaptive algorithms achieve a dynamic regret of $\tilde O(d^2 n^{1/5} [\mathcal{TV}_1(w_{1:n})]^{2/5} \vee d^2)$, where $n$ is the time horizon and $\mathcal{TV}_1(w_{1:n})$ a path variational based on second order differences of the comparator sequence. Such a path variational naturally encodes comparator sequences that are piece-wise linear โ€“ a powerful family that tracks a variety of non-stationarity patterns in practice (Kim et al., 2009). The aforementioned dynamic regret is shown to be optimal modulo dimension dependencies and poly-logarithmic factors of $n$. To the best of our knowledge, this path variational has not been studied in the non-stochastic online learning literature before. Our proof techniques rely on analysing the KKT conditions of the offline oracle and requires several non-trivial generalizations of the ideas in Baby and Wang (2021) where the latter work only implies an $\tilde{O}(n^{1/3})$ regret for the current problem. 
    more » « less
  2. We study the framework of universal dynamic regret minimization with strongly convex losses. We answer an open problem in Baby and Wang 2021 by showing that in a proper learning setup, Strongly Adaptive algorithms can achieve the near optimal dynamic regret of ๐‘‚ฬƒ (๐‘‘1/3๐‘›1/3TV[๐‘ข1:๐‘›]2/3โˆจ๐‘‘) against any comparator sequence ๐‘ข1,โ€ฆ,๐‘ข๐‘› simultaneously, where ๐‘› is the time horizon and TV[๐‘ข1:๐‘›] is the Total Variation of comparator. These results are facilitated by exploiting a number of new structures imposed by the KKT conditions that were not considered in Baby and Wang 2021 which also lead to other improvements over their results such as: (a) handling non-smooth losses and (b) improving the dimension dependence on regret. Further, we also derive near optimal dynamic regret rates for the special case of proper online learning with exp-concave losses and an ๐ฟโˆž constrained decision set. 
    more » « less
  3. afe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications. Previous primal-dual style approaches suffer from instability issues and lack optimality guarantees. This paper overcomes the issues from the perspective of probabilistic inference. We introduce a novel Expectation-Maximization approach to naturally incorporate constraints during the policy learning: 1) a provable optimal non-parametric variational distribution could be computed in closed form after a convex optimization (E-step); 2) the policy parameter is improved within the trust region based on the optimal variational distribution (M-step). The proposed algorithm decomposes the safe RL problem into a convex optimization phase and a supervised learning phase, which yields a more stable training performance. A wide range of experiments on continuous robotic tasks shows that the proposed method achieves significantly better constraint satisfaction performance and better sample efficiency than baselines. The code is available at https://github.com/liuzuxin/cvpo-safe-rl. 
    more » « less
  4. null (Ed.)
    The dynamic stall phenomenon produces adverse aerodynamic loading, which negatively affects the structural strength and life of aerodynamic systems. Aerodynamic shape optimization (ASO) provides a practical approach for delaying and mitigating dynamic stall characteristics without the addition of an auxiliary system. A typical ASO investigation requires multiple evaluations of accurate but time-consuming computational fluid dynamics (CFD) simulations. In the case of dynamic stall, unsteady CFD simulations are required for airfoil shape evaluation; combining it with high-dimensions of airfoil shape parameterization renders the ASO investigation computationally costly. In this study, metamodel-based optimization (MBO) is proposed using the multifidelity modeling (MFM) technique to efficiently conduct ASO investigation for computationally expensive dynamic stall cases. MFM methods combine data from accurate high-fidelity (HF) simulations and fast low-fidelity (LF) simulations to provide accurate and fast predictions. In particular, Cokriging regression is used for approximating the objective and constraint functions. The airfoil shape is parameterized using six PARSEC parameters. The objective and constraint functions are evaluated for a sinusoidally oscillating airfoil with the unsteady Reynolds-averaged Navier-Stokes equations at a Reynolds number of 135,000, Mach number of 0.1, and reduced frequency of 0.05. The initial metamodel is generated using 220 LF and 20 HF samples. The metamodel is then sequentially refined using the expected improvement infill criteria and validated with the normalized root mean square error. The refined metamodel is utilized for finding the optimal design. The optimal airfoil shape shows higher thickness, larger leading-edge radius, and an aft camber compared to baseline (NACA 0012). The optimal shape delays the dynamic stall occurrence by 3 degrees and reduces the peak aerodynamic coefficients. The performance of the MFM method is also compared with the single-fidelity metamodeling method using HF samples. Both the approaches produced similar optimal shapes; however, the optimal shape from MFM achieved a minimum objective function value while more closely satisfying the constraint at a computational cost saving of around 41%. 
    more » « less
  5. Ahn, Hee-Kap ; Sadakane, Kunihiko (Ed.)
    In the standard planar k-center clustering problem, one is given a set P of n points in the plane, and the goal is to select k center points, so as to minimize the maximum distance over points in P to their nearest center. Here we initiate the systematic study of the clustering with neighborhoods problem, which generalizes the k-center problem to allow the covered objects to be a set of general disjoint convex objects C rather than just a point set P. For this problem we first show that there is a PTAS for approximating the number of centers. Specifically, if r_opt is the optimal radius for k centers, then in n^O(1/ฮตยฒ) time we can produce a set of (1+ฮต)k centers with radius โ‰ค r_opt. If instead one considers the standard goal of approximating the optimal clustering radius, while keeping k as a hard constraint, we show that the radius cannot be approximated within any factor in polynomial time unless P = NP, even when C is a set of line segments. When C is a set of unit disks we show the problem is hard to approximate within a factor of (โˆš{13}-โˆš3)(2-โˆš3) โ‰ˆ 6.99. This hardness result complements our main result, where we show that when the objects are disks, of possibly differing radii, there is a (5+2โˆš3)โ‰ˆ 8.46 approximation algorithm. Additionally, for unit disks we give an O(n log k)+(k/ฮต)^O(k) time (1+ฮต)-approximation to the optimal radius, that is, an FPTAS for constant k whose running time depends only linearly on n. Finally, we show that the one dimensional version of the problem, even when intersections are allowed, can be solved exactly in O(n log n) time. 
    more » « less