skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Non-linear Welfare-Aware Strategic Learning
This paper studies algorithmic decision-making in the presence of strategic individual behaviors, where an ML model is used to make decisions about human agents and the latter can adapt their behavior strategically to improve their future data. Existing results on strategic learning have largely focused on the linear setting where agents with linear labeling functions best respond to a (noisy) linear decision policy. Instead, this work focuses on general non-linear settings where agents respond to the decision policy with only "local information" of the policy. Moreover, we simultaneously consider the objectives of maximizing decision-maker welfare (model prediction accuracy), social welfare (agent improvement caused by strategic behaviors), and agent welfare (the extent that ML underestimates the agents). We first generalize the agent best response model in previous works to the non-linear setting, then reveal the compatibility of welfare objectives. We show the three welfare can attain the optimum simultaneously only under restrictive conditions which are challenging to achieve in non-linear settings. The theoretical results imply that existing works solely maximizing the welfare of a subset of parties inevitably diminish the welfare of the others. We thus claim the necessity of balancing the welfare of each party in non-linear settings and propose an irreducible optimization algorithm suitable for general strategic learning. Experiments on synthetic and real data validate the proposed algorithm.  more » « less
Award ID(s):
2202699
PAR ID:
10534978
Author(s) / Creator(s):
;
Publisher / Repository:
7th AAAI Conference on AI, Ethics, and Society
Date Published:
Format(s):
Medium: X
Location:
7th AAAI Conference on AI, Ethics, and Society
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the problem of online binary classification where strategic agents can manipulate their observable features in predefined ways, modeled by a manipulation graph, in order to receive a positive classification. We show this setting differs in fundamental ways from classic (non-strategic) online classification. For instance, whereas in the non-strategic case, a mistake bound of ln |H| is achievable via the halving algorithm when the target function belongs to a known class H, we show that no deterministic algorithm can achieve a mistake bound o(Δ) in the strategic setting, where Δ is the maximum degree of the manipulation graph (even when |H| = O(Δ)). We complement this with a general algorithm achieving mistake bound O(Δ ln |H|). We also extend this to the agnostic setting, and show that this algorithm achieves a Δ multiplicative regret (mistake bound of O(Δ · OPT + Δ · ln |H|)), and that no deterministic algorithm can achieve o(Δ) multiplicative regret. Next, we study two randomized models based on whether the random choices are made before or after agents respond, and show they exhibit fundamental differences. In the first, fractional model, at each round the learner deterministically chooses a probability distribution over classifiers inducing expected values on each vertex (probabilities of being classified as positive), which the strategic agents respond to. We show that any learner in this model has to suffer linear regret. On the other hand, in the second randomized algorithms model, while the adversary who selects the next agent must respond to the learner's probability distribution over classifiers, the agent then responds to the actual hypothesis classifier drawn from this distribution. Surprisingly, we show this model is more advantageous to the learner, and we design randomized algorithms that achieve sublinear regret bounds against both oblivious and adaptive adversaries. 
    more » « less
  2. This paper studies algorithmic decision-making under human's strategic behavior, where a decision maker uses an algorithm to make decisions about human agents, and the latter with information about the algorithm may exert effort strategically and improve to receive favorable decisions. Unlike prior works that assume agents benefit from their efforts immediately, we consider realistic scenarios where the impacts of these efforts are persistent and agents benefit from efforts by making improvements gradually. We first develop a dynamic model to characterize persistent improvements and based on this construct a Stackelberg game to model the interplay between agents and the decision-maker. We analytically characterize the equilibrium strategies and identify conditions under which agents have incentives to improve. With the dynamics, we then study how the decision-maker can design an optimal policy to incentivize the largest improvements inside the agent population. We also extend the model to settings where 1) agents may be dishonest and game the algorithm into making favorable but erroneous decisions; 2) honest efforts are forgettable and not sufficient to guarantee persistent improvements. With the extended models, we further examine conditions under which agents prefer honest efforts over dishonest behavior and the impacts of forgettable efforts. 
    more » « less
  3. Etessami, Kousha; Feige, Uriel; Puppis, Gabriele (Ed.)
    Online allocation is a broad class of problems where items arriving online have to be allocated to agents who have a fixed utility/cost for each assigned item so to maximize/minimize some objective. This framework captures a broad range of fundamental problems such as the Santa Claus problem (maximizing minimum utility), Nash welfare maximization (maximizing geometric mean of utilities), makespan minimization (minimizing maximum cost), minimization of 𝓁_p-norms, and so on. We focus on divisible items (i.e., fractional allocations) in this paper. Even for divisible items, these problems are characterized by strong super-constant lower bounds in the classical worst-case online model. In this paper, we study online allocations in the learning-augmented setting, i.e., where the algorithm has access to some additional (machine-learned) information about the problem instance. We introduce a general algorithmic framework for learning-augmented online allocation that produces nearly optimal solutions for this broad range of maximization and minimization objectives using only a single learned parameter for every agent. As corollaries of our general framework, we improve prior results of Lattanzi et al. (SODA 2020) and Li and Xian (ICML 2021) for learning-augmented makespan minimization, and obtain the first learning-augmented nearly-optimal algorithms for the other objectives such as Santa Claus, Nash welfare, 𝓁_p-minimization, etc. We also give tight bounds on the resilience of our algorithms to errors in the learned parameters, and study the learnability of these parameters. 
    more » « less
  4. Real-world online matching markets (OMMs) often involve multiple objectives, such as maximizing relevance and diversity in online recommendation and crowdsourcing systems. In this paper, we propose a generic bi-objective maximization model for OMMs with the following features: (1) there are two types of agents—offline and online—with online agents arriving dynamically and stochastically; (2) upon each online agent’s arrival, an immediate and irrevocable decision must be made regarding which subset of relevant offline agents to assign; and (3) each offline and online agent has a specific matching capacity, i.e., an upper bound on the number of allowable matchings. Our model supports two general linear objective functions defined over all possible assignments to online agents. We formulate a bi-objective linear program (LP) and design an LP-based parameterized algorithm. Departing from prevalent non-adaptive attenuation methods, we introduce a time-adaptive attenuation framework that achieves an almost tight competitive ratio for each objective. To complement our theoretical analysis, we implement the proposed algorithm and evaluate it against several heuristics using two real-world datasets. Extensive experimental results demonstrate the flexibility and effectiveness of our approach, validating our theoretical predictions. 
    more » « less
  5. We study the problem of identifying the best action in a sequential decision-making setting when the reward distributions of the arms exhibit a non-trivial dependence structure, which is governed by the underlying causal model of the domain where the agent is deployed. In this setting, playing an arm corresponds to intervening on a set of variables and setting them to specific values. In this paper, we show that whenever the underlying causal model is not taken into account during the decision-making process, the standard strategies of simultaneously intervening on all variables or on all the subsets of the variables may, in general, lead to suboptimal policies, regardless of the number of interventions performed by the agent in the environment. We formally acknowledge this phenomenon and investigate structural properties implied by the underlying causal model, which lead to a complete characterization of the relationships between the arms’ distributions. We leverage this characterization to build a new algorithm that takes as input a causal structure and finds a minimal, sound, and complete set of qualified arms that an agent should play to maximize its expected reward. We empirically demonstrate that the new strategy learns an optimal policy and leads to orders of magnitude faster convergence rates when compared with its causal-insensitive counterparts. 
    more » « less