In many predictive decision-making scenarios, such as credit scoring and academic testing, a decision-maker must construct a model that accounts for agents' incentives to "game" their features in order to receive better decisions. Whereas the strategic classification literature generally assumes that agents' outcomes are not causally dependent on their features (and thus strategic behavior is a form of lying), we join concurrent work in modeling agents' outcomes as a function of their changeable attributes. Our formulation is the first to incorporate a crucial phenomenon: when agents act to change observable features, they may as a side effect perturb unobserved features that causally affect their true outcomes. We consider three distinct desiderata for a decision-maker's model: accurately predicting agents' post-gaming outcomes (accuracy), incentivizing agents to improve these outcomes (improvement), and, in the linear setting, estimating the visible coefficients of the true causal model (causal precision). As our main contribution, we provide the first algorithms for learning accuracy-optimizing, improvement-optimizing, and causal-precision-optimizing linear regression models directly from data, without prior knowledge of agents' possible actions. These algorithms circumvent the hardness result of Miller et al. (2019) by allowing the decision-maker to observe agents' responses to a sequence of decision rules, in effect inducing agents to perform causal interventions for free.
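To make the setting concrete, here is a minimal sketch (our illustration, not the paper's learning algorithms) of agents best-responding to a published linear scoring rule by shifting observable features, with a side effect on an unobserved feature that also drives the true outcome. The coefficients, response model, and side-effect magnitude are all assumptions made for the example.

```python
import numpy as np

# Sketch of the modeled phenomenon (not the paper's algorithms): agents best-respond
# to a published linear scoring rule by shifting observable features, which as a
# side effect perturbs an unobserved feature that also drives the true outcome.
# Coefficients, the response model, and the side-effect magnitude are assumptions.

rng = np.random.default_rng(0)
n = 1000
x_obs = rng.normal(size=(n, 2))          # observable features (e.g., test scores)
u = rng.normal(size=n)                   # unobserved feature (e.g., mastery)

true_w_obs = np.array([1.0, 0.5])        # visible causal coefficients
true_w_u = 2.0                           # causal effect of the unobserved feature

def true_outcome(x, u):
    return x @ true_w_obs + true_w_u * u

# Decision-maker publishes a linear rule; agents move observables in its direction.
published_rule = np.array([2.0, -1.0])
effort = 0.3                             # assumed per-unit responsiveness
x_gamed = x_obs + effort * published_rule

# Side effect: acting on observables also shifts the unobserved feature
# (e.g., cramming for a test crowds out deeper learning).
side_effect = -0.2                       # assumed sign and magnitude
u_gamed = u + side_effect * effort * np.linalg.norm(published_rule)

print("mean outcome change after gaming:",
      (true_outcome(x_gamed, u_gamed) - true_outcome(x_obs, u)).mean())
```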
On Classification of Strategic Agents Who Can Both Game and Improve
In this work, we consider classification of agents who can both game and improve. For example, people wishing to get a loan may be able to take some actions that increase their perceived creditworthiness and others that also increase their true creditworthiness. A decision-maker would like to define a classification rule with few false positives (giving out few bad loans) while yielding many true positives (giving out many good loans), which includes encouraging agents to improve and become true positives where possible. We consider two models for this problem, a general discrete model and a linear model, and prove algorithmic, learning, and hardness results for each. For the general discrete model, we give an efficient algorithm for maximizing the number of true positives subject to no false positives, and show how to extend this to a partial-information learning setting. We also show hardness for the problem of maximizing the number of true positives subject to a nonzero bound on the number of false positives, and that this hardness holds even for a finite-point version of our linear model. Maximizing the number of true positives subject to no false positives is likewise NP-hard in our full linear model. We additionally provide an algorithm that determines whether there exists a linear classifier that classifies all agents accurately and causes all improvable agents to become qualified, and give additional results for low-dimensional data.
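As a toy illustration of the discrete model (not the paper's algorithms), the sketch below scores a classification rule by the true and false positives it induces after agents move to an accepted reachable point when they can; the agents, points, and qualification labels are invented for the example.

```python
from dataclasses import dataclass

# Toy discrete-model evaluation (not the paper's algorithms): each agent has a
# current point and a set of reachable points; some points are truly "qualified".
# Given the set of accepted points, an agent moves to an accepted reachable point
# if one exists, and we count the resulting true and false positives.
# Agents, points, and labels below are invented for illustration.

@dataclass
class Agent:
    current: str
    reachable: set          # points the agent can move to (including staying put)

QUALIFIED = {"B", "C"}      # points whose true label is positive

agents = [
    Agent("A", {"A", "B"}), # can improve: moving to B makes the agent qualified
    Agent("A", {"A", "D"}), # can only game: D looks acceptable but is unqualified
    Agent("B", {"B"}),      # already qualified
]

def evaluate(accepted: set):
    true_pos = false_pos = 0
    for agent in agents:
        options = agent.reachable & accepted
        final = min(options) if options else agent.current
        if final in accepted:
            if final in QUALIFIED:
                true_pos += 1
            else:
                false_pos += 1
    return true_pos, false_pos

print(evaluate({"B"}))       # accepting only B encourages improvement: (2, 0)
print(evaluate({"B", "D"}))  # also accepting D admits a gamer: (2, 1)
```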
- PAR ID: 10353446
- Editor(s): Celis, L. Elisa
- Date Published:
- Journal Name: Symposium on Foundations of Responsible Computing (FORC)
- Volume: 3
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract
Background: Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) that cause observed phenotypes. However, with highly correlated SNPs, correlated observations, and the number of SNPs being two orders of magnitude larger than the number of observations, GWAS procedures often suffer from high false positive rates.
Results: We propose BGWAS, a novel Bayesian variable selection method for linear mixed models (LMMs), based on nonlocal priors and tailored specifically to genome-wide association studies. BGWAS has two steps: screening and model selection. The screening step scans through all the SNPs, fitting one LMM per SNP, and then uses Bayesian false discovery control to select a set of candidate SNPs. After that, a model selection step searches through the space of LMMs that may include any number of SNPs from the candidate set. A simulation study shows that, compared to popular GWAS procedures, BGWAS greatly reduces false positives while maintaining the same ability to detect true positive SNPs. We demonstrate the utility and flexibility of BGWAS with two case studies: one on salt stress in plants and one on alcohol use disorder.
Conclusions: BGWAS maintains, and in some cases increases, the recall of true SNPs while drastically lowering the number of false positives compared to popular single-marker analysis (SMA) procedures.
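As a rough sketch of the screening idea only (not BGWAS itself, which fits one linear mixed model per SNP with nonlocal priors and Bayesian false discovery control), the snippet below fits one ordinary single-SNP regression at a time and applies Benjamini–Hochberg control as a stand-in, on simulated data.

```python
import numpy as np
from scipy import stats

# Simplified sketch of the screening step only (not BGWAS): ordinary least squares
# per SNP and Benjamini-Hochberg control stand in for the LMM / Bayesian machinery.
# The simulated genotypes, effect sizes, and causal SNPs are illustrative.

rng = np.random.default_rng(1)
n_obs, n_snps = 200, 5000
genotypes = rng.integers(0, 3, size=(n_obs, n_snps)).astype(float)  # 0/1/2 coding
causal = [10, 250, 4000]                                            # true signals
phenotype = genotypes[:, causal] @ np.array([0.8, -0.6, 0.5]) + rng.normal(size=n_obs)

# Screening: one single-SNP regression at a time.
pvals = np.array([stats.linregress(genotypes[:, j], phenotype).pvalue
                  for j in range(n_snps)])

# Benjamini-Hochberg control at level 0.05 to pick candidate SNPs.
alpha = 0.05
order = np.argsort(pvals)
passed = pvals[order] <= alpha * np.arange(1, n_snps + 1) / n_snps
k = passed.nonzero()[0].max() + 1 if passed.any() else 0
print("candidate SNPs:", np.sort(order[:k]))   # a model-selection step would follow
```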
-
Consider a principal who wants to search through a space of stochastic solutions for one maximizing their utility. If the principal cannot conduct this search on their own, they may instead delegate the problem to an agent with distinct and potentially misaligned utilities. This is called delegated search, and the principal in such problems faces a mechanism design problem: they must incentivize the agent to find and propose a solution maximizing the principal's expected utility. Following prior work in this area, we consider mechanisms without payments and aim to achieve a multiplicative approximation of the utility the principal would obtain by solving the problem without delegation. In this work, we investigate a natural and recently studied generalization of this model to multiple agents and find nearly tight bounds on the principal's approximation as the number of agents increases. As one might expect, this approximation approaches 1 as the number of agents grows, but, somewhat surprisingly, we show that this is largely not due to direct competition among agents.
-
Given N geo-located point instances (e.g., crime or disease cases) in a spatial domain, we aim to detect sub-regions (i.e., hotspots) that have a higher probability density of generating such instances than the rest of the domain. Hotspot detection has been widely used in a variety of important urban applications, including public safety, public health, urban planning, and equity. The problem is challenging because its societal applications often have low tolerance for false positives and require significance testing, which is computationally intensive. In related work, the spatial scan statistic introduced a likelihood-ratio-based framework for hotspot evaluation and significance testing. However, it fails to consider the effect of spatial nondeterminism, causing many missed detections. Our previous work introduced a nondeterministic-normalization-based scan statistic to mitigate this issue, but its control of false positives is not stable. To address these limitations, we propose a unified framework that improves the completeness of results without incurring more false positives. We also propose a reduction algorithm to improve computational efficiency. Experimental results confirm that the unified framework greatly improves the recall of hotspot detection without increasing the number of false positives, and that the reduction algorithm greatly reduces execution time.
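For intuition about the likelihood-ratio framework mentioned above (not the proposed unified framework or reduction algorithm), here is a naive circular-scan sketch using a Kulldorff-style Poisson likelihood ratio; the candidate circles and simulated points are illustrative, and Monte Carlo significance testing is omitted.

```python
import numpy as np

# Naive circular-scan sketch for intuition (not the paper's unified framework or
# reduction algorithm): for each candidate circle, compare the observed point count
# to the count expected under spatial uniformity with a Kulldorff-style Poisson
# likelihood ratio. Candidate circles and simulated points are illustrative;
# Monte Carlo significance testing is omitted.

rng = np.random.default_rng(2)
background = rng.uniform(0, 10, size=(300, 2))                 # uniform noise
cluster = rng.normal(loc=[7.0, 7.0], scale=0.5, size=(60, 2))  # planted hotspot
points = np.vstack([background, cluster])
N, total_area = len(points), 10.0 * 10.0

def poisson_llr(c, e, total):
    """Log-likelihood ratio for c observed vs. e expected points inside a window."""
    if c <= e:
        return 0.0
    outside_obs, outside_exp = total - c, total - e
    llr = c * np.log(c / e)
    if outside_obs > 0:
        llr += outside_obs * np.log(outside_obs / outside_exp)
    return llr

best_score, best_circle = 0.0, None
for cx in np.arange(0.5, 10.0, 0.5):
    for cy in np.arange(0.5, 10.0, 0.5):
        for r in (0.5, 1.0, 1.5):
            inside = np.linalg.norm(points - [cx, cy], axis=1) <= r
            expected = N * (np.pi * r**2) / total_area   # ignores boundary effects
            score = poisson_llr(inside.sum(), expected, N)
            if score > best_score:
                best_score, best_circle = score, (cx, cy, r)

print("highest-scoring circle:", best_circle, "LLR:", round(best_score, 2))
```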
-
We study linear Fisher markets with satiation. In these markets, sellers have earning limits, and buyers have utility limits. Beyond applications in economics, they arise in the context of maximizing Nash social welfare when allocating indivisible items to agents. In contrast to markets with either earning or utility limits, markets with both limits have not been studied before. They turn out to have fundamentally different properties. In general, the existence of competitive equilibria is not guaranteed. We identify a natural property of markets (termed money clearing) that implies existence. We show that the set of equilibria is not always convex, answering a question posed in the literature. We design an FPTAS to compute an approximate equilibrium and prove that the problem of computing an exact equilibrium lies in the complexity class continuous local search (CLS; i.e., the intersection of polynomial local search (PLS) and polynomial parity arguments on directed graphs (PPAD)). For a constant number of buyers or goods, we give a polynomial-time algorithm to compute an exact equilibrium. We show how (approximate) equilibria can be rounded and provide the first constant-factor approximation algorithm (with a factor of 2.404) for maximizing Nash social welfare when agents have capped linear (also known as budget-additive) valuations. Finally, we significantly improve the approximation hardness for additive valuations to [Formula: see text]. Funding: J. Garg was supported by the National Science Foundation [Grant CCF-1942321 (CAREER)]. M. Hoefer was supported by Deutsche Forschungsgemeinschaft [Grants Ho 3831/5-1, Ho 3831/6-1, and Ho 3831/7-1].
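For reference, "capped linear (budget-additive)" valuations simply cap an additive valuation at a satiation point; the sketch below defines such a valuation and the Nash social welfare of an allocation, with invented agents, items, values, and caps.

```python
import math

# Minimal sketch of capped linear (budget-additive) valuations and the Nash social
# welfare objective; agents, items, values, and caps are invented for illustration.

values = {                          # values[agent][item]
    "a1": {"g1": 3.0, "g2": 1.0},
    "a2": {"g1": 2.0, "g2": 2.0},
}
caps = {"a1": 3.5, "a2": 10.0}      # satiation points (utility caps)

def utility(agent, bundle):
    # capped linear: additive value, truncated at the agent's cap
    return min(caps[agent], sum(values[agent][g] for g in bundle))

def nash_social_welfare(allocation):
    utils = [utility(agent, bundle) for agent, bundle in allocation.items()]
    return math.prod(utils) ** (1.0 / len(utils))

allocation = {"a1": {"g1"}, "a2": {"g2"}}
print(round(nash_social_welfare(allocation), 3))   # geometric mean of utilities
```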