Title: Online Learning with Primary and Secondary Losses
We study the problem of online learning with primary and secondary losses. For example, a recruiter deciding which job applicants to hire might weigh false positives and false negatives equally (the primary loss), while the applicants might weigh false negatives much more heavily (the secondary loss). We consider the following question: Can we combine "expert advice" to achieve low regret with respect to the primary loss while, at the same time, performing \emph{not much worse than the worst expert} with respect to the secondary loss? Unfortunately, we show that this goal is unachievable without a bounded-variance assumption on the secondary loss. More generally, we consider the goal of minimizing regret with respect to the primary loss while bounding the secondary loss by a linear threshold. On the positive side, we show that running any switching-limited algorithm achieves this goal if every expert's secondary loss exceeds the linear threshold by at most o(T) over any time interval. If not all experts satisfy this assumption, our algorithms can still achieve the goal given access to external oracles that determine when to deactivate and reactivate experts.
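For intuition, the sketch below shows one naive baseline for this setting (not the paper's switching-limited or oracle-based algorithms): standard multiplicative-weights (Hedge) updates driven by the primary loss, with an expert deactivated once its cumulative secondary loss exceeds a linear threshold c·t. The function name, the parameter c, and the deactivation rule are assumptions made for this illustration.

import numpy as np

def hedge_with_secondary_threshold(primary_losses, secondary_losses, c, eta=0.1):
    # Illustrative sketch only: Hedge on the primary loss, with experts deactivated
    # once their cumulative secondary loss exceeds the linear threshold c * t.
    # primary_losses, secondary_losses: arrays of shape (T, K), losses assumed in [0, 1].
    T, K = primary_losses.shape
    weights = np.ones(K)                  # Hedge weights, updated by the primary loss only
    active = np.ones(K, dtype=bool)       # experts not yet deactivated
    cum_secondary = np.zeros(K)           # per-expert cumulative secondary loss
    learner_primary = 0.0
    learner_secondary = 0.0
    for t in range(T):
        if not active.any():              # every expert has violated the threshold
            break
        p = weights * active
        p = p / p.sum()                   # distribution over the active experts
        learner_primary += p @ primary_losses[t]
        learner_secondary += p @ secondary_losses[t]
        weights *= np.exp(-eta * primary_losses[t])        # multiplicative-weights step
        cum_secondary += secondary_losses[t]
        active &= cum_secondary <= c * (t + 1)              # linear-threshold deactivation
    return learner_primary, learner_secondary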
Award ID(s): 1815011
PAR ID: 10283159
Author(s) / Creator(s):
Date Published:
Journal Name: Advances in Neural Information Processing Systems
Volume: 33
ISSN: 1049-5258
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. We study a generalization of the online binary prediction with expert advice framework where, at each round, the learner is allowed to pick m ≥ 1 experts from a pool of K experts and the overall utility is a modular or submodular function of the chosen experts. We focus on the setting in which experts act strategically and aim to maximize their influence on the algorithm's predictions by potentially misreporting their beliefs about the events. Among other applications, this setting arises in forecasting competitions, where the learner seeks not only to make predictions by aggregating different forecasters but also to rank them according to their relative performance. Our goal is to design algorithms that satisfy two requirements: 1) incentive-compatible: incentivize the experts to report their beliefs truthfully, and 2) no-regret: achieve sublinear regret with respect to the true beliefs of the best fixed set of m experts in hindsight. Prior works have studied this framework for m = 1 and provided incentive-compatible no-regret algorithms for the problem. We first show that a simple reduction of our problem to the m = 1 setting is neither efficient nor effective. Then, we provide algorithms that utilize the specific structure of the utility functions to achieve the two desired goals.
  2. Chiappa, Silvia; Calandra, Roberto (Ed.)
    This paper considers a variant of the classical online learning problem with expert predictions. Our model's differences and challenges arise from the lack of any direct feedback on the loss each expert incurs at each time step $t$. We propose an approach that uses peer prediction and identify conditions under which it succeeds. Our techniques revolve around a carefully designed peer score function $s()$ that scores experts' predictions based on the peer consensus. We show a sufficient condition, which we call \emph{peer calibration}, under which standard online learning algorithms using loss feedback computed by the carefully crafted $s()$ have bounded regret with respect to the unrevealed ground-truth values. We then demonstrate how suitable $s()$ functions can be derived for different assumptions and models. (See the illustrative sketch of a consensus-based score after this list.)
  3. In this paper, we study a class of online optimization problems with long-term budget constraints where the objective functions are not necessarily concave (nor convex) but instead satisfy the Diminishing Returns (DR) property. In this online setting, a sequence of monotone DR-submodular objective functions and linear budget functions arrive over time and, assuming a limited total budget, the goal is to take an action at each time, before observing the utility and budget functions arriving at that round, so as to achieve a sub-linear regret bound while keeping the total budget violation sub-linear as well. Prior work has shown that achieving sub-linear regret and sub-linear total budget violation simultaneously is impossible if the utility and budget functions are chosen adversarially. We therefore modify the notion of regret by comparing the agent against the best fixed decision in hindsight that satisfies the budget constraint proportionally over any window of length W. We propose the Online Saddle Point Hybrid Gradient (OSPHG) algorithm to solve this class of online problems. For W = T, we recover the aforementioned impossibility result. However, if W is sublinear in T, we show that it is possible to obtain sub-linear bounds for both the regret and the total budget violation. (A generic sketch of the underlying primal-dual pattern appears after this list.)
  4. Dasgupta, S.; Haghtalab, N. (Ed.)
    We consider a lifelong learning scenario in which a learner faces a never-ending and arbitrary stream of facts and has to decide which ones to retain in its limited memory. We introduce a mathematical model based on the online learning framework, in which the learner measures itself against a collection of experts that are also memory-constrained and that reflect different policies for what to remember. Interspersed with the stream of facts are occasional questions, and on each of these the learner incurs a loss if it has not remembered the corresponding fact. Its goal is to do almost as well as the best expert in hindsight while using roughly the same amount of memory. We identify difficulties with using the multiplicative weights update algorithm in this memory-constrained scenario, and we design an alternative scheme whose regret guarantees are close to the best possible.
  5. This paper studies privacy in the context of decision-support queries that classify objects as either true or false based on whether they satisfy the query. Mechanisms that ensure privacy may introduce false positives and false negatives. In decision-support applications, false negatives often have to remain bounded. Existing accuracy-aware privacy-preserving techniques cannot be used directly to support such an accuracy requirement, and their naive adaptations to support bounded accuracy of false negatives result in significant privacy loss that depends on the distribution of the data. This paper explores the concept of minimally-invasive data exploration (MIDE) for decision support, which attempts to minimize privacy loss while supporting a bounded guarantee on false negatives by adaptively adjusting privacy based on the data distribution. Our experimental results show that the MIDE algorithms perform well and are robust to variations in data distributions.
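Regarding the peer-prediction setting in item 2 above: as a minimal, purely illustrative sketch (one plausible choice of a consensus-based score, not the paper's $s()$), each expert can be scored against the leave-one-out average of the other experts' forecasts, and the resulting surrogate losses fed to a standard no-regret algorithm such as Hedge. The function name and the squared-error scoring below are assumptions made for the example.

import numpy as np

def peer_consensus_losses(predictions):
    # Illustrative surrogate loss: each expert's probability forecast is scored by its
    # squared distance to the leave-one-out average of the other experts' forecasts
    # (the "peer consensus").  predictions: array of shape (K,), K >= 2, values in [0, 1].
    K = predictions.shape[0]
    peer_mean = (predictions.sum() - predictions) / (K - 1)   # consensus excluding self
    return (predictions - peer_mean) ** 2                     # per-expert surrogate loss

# Example: these losses can stand in for the unavailable ground-truth losses in a Hedge update.
surrogate = peer_consensus_losses(np.array([0.9, 0.8, 0.2]))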
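For item 3 above, the sketch below shows the generic online primal-dual (saddle-point) pattern that such algorithms build on: a gradient step on the Lagrangian in the decision variable, followed by a dual ascent step on the observed budget violation. It is written under simplifying assumptions (a box-constrained continuous decision and a linear budget with zero intercept) and is not the OSPHG algorithm itself; all names and step sizes are illustrative.

import numpy as np

def online_primal_dual(grad_utility, grad_budget, budget_per_round, T, dim,
                       step_x=0.05, step_lam=0.05):
    # Generic primal-dual sketch for online optimization with a long-term budget.
    # grad_utility(t, x) and grad_budget(t, x) return the round-t gradients, revealed
    # only after the action x is played.
    x = np.zeros(dim)        # current action, kept in the box [0, 1]^dim
    lam = 0.0                # dual variable (price) on the budget constraint
    for t in range(T):
        gu = grad_utility(t, x)                        # feedback for the action just played
        gb = grad_budget(t, x)
        violation = float(gb @ x) - budget_per_round   # round-t spend minus per-round budget
        x = np.clip(x + step_x * (gu - lam * gb), 0.0, 1.0)  # ascent on the Lagrangian in x
        lam = max(0.0, lam + step_lam * violation)            # dual ascent on the violation
    return x, lam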