In lifelong learning, tasks (or classes) to be learned arrive sequentially over time in arbitrary order. During training, knowledge from previous tasks can be captured and transferred to subsequent ones to improve sample efficiency. We consider the setting where all target tasks can be represented in the span of a small number of unknown linear or nonlinear features of the input data. We propose a lifelong learning algorithm that maintains and refines the internal feature representation. We prove that for any desired accuracy on all tasks, the dimension of the representation remains close to that of the underlying representation. The resulting sample complexity improves significantly on existing bounds. In the setting of linear features, our algorithm is provably efficient and the sample complexity for input dimension d, m tasks with k features up to error ϵ is O~(dk1.5/ϵ+km/ϵ). We also prove a matching lower bound for any lifelong learning algorithm that uses a single task learner as a black box. We complement our analysis with an empirical study, including a heuristic lifelong learning algorithm for deep neural networks. Our method performs favorably on challenging realistic image datasets compared to state-of-the-art continual learning methods.
more »
« less
On the Complexity of Simulating Auxiliary Input
We construct a simulator for the simulating auxiliary input problem with complexity better than all previous results and prove the optimality up to logarithmic factors by establishing a black-box lower bound. Specifically, let ℓ be the length of the auxiliary input and ϵ be the indistinguishability parameter. Our simulator is O~(2^ℓ/ϵ^2) more complicated than the distinguisher family. For the lower bound, we show the relative complexity to the distinguisher of a simulator is at least Ω(2^ℓ/ϵ^2) assuming the simulator is restricted to use the distinguishers in a black-box way and satisfy a mild restriction.
more »
« less
- Award ID(s):
- 1749750
- PAR ID:
- 10081826
- Date Published:
- Journal Name:
- Advances in Cryptology – EUROCRYPT 2018
- Volume:
- 10822
- Page Range / eLocation ID:
- 371-390
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
We study smoothed analysis of distributed graph algorithms, focusing on the fundamental minimum spanning tree (MST) problem. With the goal of studying the time complexity of distributed MST as a function of the "perturbation" of the input graph, we posit a smoothing model that is parameterized by a smoothing parameter 0 ≤ ϵ(n) ≤ 1 which controls the amount of random edges that can be added to an input graph G per round. Informally, ϵ(n) is the probability (typically a small function of n, e.g., n--¼) that a random edge can be added to a node per round. The added random edges, once they are added, can be used (only) for communication. We show upper and lower bounds on the time complexity of distributed MST in the above smoothing model. We present a distributed algorithm that, with high probability, 1 computes an MST and runs in Õ(min{1/√ϵ(n)2O(√log n), D+ √n}) rounds2 where ϵ is the smoothing parameter, D is the network diameter and n is the network size. To complement our upper bound, we also show a lower bound of Ω(min{1/√ϵ(n), D + √n}). We note that the upper and lower bounds essentially match except for a multiplicative 2O(√log n) polylog(n) factor. Our work can be considered as a first step in understanding the smoothed complexity of distributed graph algorithms.more » « less
-
We study the impact of nonconvexity on the complexity of nonsmooth optimization, emphasizing objectives such as piecewise linear functions, which may not be weakly convex. We focus on a dimension-independent analysis, slightly modifying a 2020 black-box algorithm of Zhang-Lin-Jegelka-Sra-Jadbabaie that approximates an ϵ-stationary point of any directionally differentiable Lipschitz objective using [Formula: see text] calls to a specialized subgradient oracle and a randomized line search. Seeking by contrast a deterministic method, we present a simple black-box version that achieves [Formula: see text] for any difference-of-convex objective and [Formula: see text] for the weakly convex case. Our complexity bound depends on a natural nonconvexity modulus that is related, intriguingly, to the negative part of directional second derivatives of the objective, understood in the distributional sense. Funding: This work was supported by the National Science Foundation [Grant DMS-2006990].more » « less
-
SPRINGER (Ed.)In this work we study the problem of minimizing the round complexity for securely evaluating multiparty functionalities while making black-box use of polynomial time assumptions. In Eurocrypt 2016, Garg et al. showed that assuming all parties have access to a broadcast channel, then at least four rounds of communication are required to securely realize non-trivial functionalities in the plain model. A sequence of works follow-up the result of Garg et al. matching this lower bound under a variety of assumptions. Unfortunately, none of these works make black-box use of the underlying cryptographic primitives. In Crypto 2021, Ishai, Khurana, Sahai, and Srinivasan came closer to matching the four-round lower bound, obtaining a five-round protocol that makes black-box use of oblivious transfer and PKE with pseudorandom public keys. In this work, we show how to realize any input-less functionality (e.g., coin-tossing, generation of key-pairs, and so on) in four rounds while making black-box use of two-round oblivious transfer. As an additional result, we construct the first four-round MPC protocol for generic functionalities that makes black-box use of the underlying primitives, achieving security against non-aborting adversaries. Our protocols are based on a new primitive called list two-party computation. This primitive offers relaxed security compared to the standard notion of secure two-party computation. Despite this relaxation, we argue that this tool suffices for our applications. List two-party computation is of independent interest, as we argue it can also be used for the generation of setups, like oblivious transfer correlated randomness, in three rounds. Prior to our work, generating such a setup required at least four rounds of interactions or a trusted third party.more » « less
-
null (Ed.)Abstract We study minimization of a structured objective function, being the sum of a smooth function and a composition of a weakly convex function with a linear operator. Applications include image reconstruction problems with regularizers that introduce less bias than the standard convex regularizers. We develop a variable smoothing algorithm, based on the Moreau envelope with a decreasing sequence of smoothing parameters, and prove a complexity of $${\mathcal {O}}(\epsilon ^{-3})$$ O ( ϵ - 3 ) to achieve an $$\epsilon $$ ϵ -approximate solution. This bound interpolates between the $${\mathcal {O}}(\epsilon ^{-2})$$ O ( ϵ - 2 ) bound for the smooth case and the $${\mathcal {O}}(\epsilon ^{-4})$$ O ( ϵ - 4 ) bound for the subgradient method. Our complexity bound is in line with other works that deal with structured nonsmoothness of weakly convex functions.more » « less
An official website of the United States government

