

Title: Algorithms and Learning for Fair Portfolio Design
We consider a variation on the classical finance problem of optimal portfolio design. In our setting, a large population of consumers is drawn from some distribution over risk tolerances, and each consumer must be assigned to a portfolio of lower risk than her tolerance. The consumers may also belong to underlying groups (for instance, of demographic properties or wealth), and the goal is to design a small number of portfolios that are fair across groups in a particular and natural technical sense. Our main results are algorithms for optimal and near-optimal portfolio design for both social welfare and fairness objectives, both with and without assumptions on the underlying group structure. We describe an efficient algorithm based on an internal two-player zero-sum game that learns near-optimal fair portfolios ex ante, and show experimentally that it can be used to obtain a small set of fair portfolios ex post as well. For the special but natural case in which group structure coincides with risk tolerances (which models the reality that wealthy consumers generally tolerate greater risk), we give an efficient and optimal fair algorithm. We also provide generalization guarantees for the underlying risk distribution that have no dependence on the number of portfolios, and illustrate the theory with simulation results.
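As a rough illustration of the internal zero-sum-game idea, the following is a minimal sketch (not the paper's algorithm): a hypothetical matrix `group_utils` records the utility each group derives from each candidate portfolio, an "auditor" runs multiplicative weights over groups, and a "learner" best-responds with a portfolio. The learner's empirical play frequencies then approximate a maximin-fair mixture over portfolios.

```python
import numpy as np

def fair_portfolio_game(group_utils, n_rounds=500, eta=0.1):
    """Sketch of a two-player zero-sum game between an auditor (over groups)
    and a learner (over candidate portfolios), solved with multiplicative
    weights plus best response.  group_utils[g, k] is the hypothetical utility
    group g derives from candidate portfolio k."""
    n_groups, n_portfolios = group_utils.shape
    weights = np.ones(n_groups)           # auditor's weights over groups
    play_counts = np.zeros(n_portfolios)  # learner's empirical mixture

    for _ in range(n_rounds):
        p = weights / weights.sum()
        # learner best-responds to the auditor's current distribution over groups
        k = int(np.argmax(p @ group_utils))
        play_counts[k] += 1
        # auditor up-weights groups that fare poorly under the chosen portfolio
        weights *= np.exp(-eta * group_utils[:, k])

    return play_counts / n_rounds          # approximate maximin-fair mixture
```

As the number of rounds grows, this averaged mixture converges to an approximate equilibrium of the game, i.e., a mixture over portfolios that protects the worst-off group.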
Award ID(s):
1763307 1934876
NSF-PAR ID:
10267306
Author(s) / Creator(s):
Date Published:
Journal Name:
Economics and Computation (EC) 2021
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Given a set of facilities and clients, and costs to open facilities, the classic facility location problem seeks to open a set of facilities and assign each client to one open facility, so as to minimize the cost of opening the chosen facilities plus the total distance of the clients to their assigned open facilities. Such an objective may induce an unequal cost over certain socioeconomic groups of clients (i.e., the total distance traveled by clients in such a group). This is important when planning the location of socially relevant facilities such as emergency rooms. In this work, we consider a fair version of the problem in which we are given r client groups that partition the set of clients, and the distance of a given group is defined as the average distance of clients in the group to their respective open facilities. The objective is to minimize the Minkowski p-norm of the vector of group distances, to penalize high access costs to open facilities across the r groups of clients. This generalizes classic facility location (p = 1) and the minimization of the maximum group distance (p = ∞). However, in practice, fairness criteria may not be explicit or even known to a decision maker, and it is often unclear how to select a specific p to model the cost of unfairness. To get around this, we study the notion of solution portfolios, where for a fixed problem instance we seek a small portfolio of solutions such that for any Minkowski norm p, one of these solutions is an O(1)-approximation. Using the geometric relationship between various p-norms, we show the existence of a portfolio of cardinality O(log r), and a lower bound of Ω(\sqrt{log r}). There may not be common structure across different solutions in this portfolio, which can make planning difficult if the notion of fairness changes over time or if the budget to open facilities is disbursed over time. For example, small changes in p could lead to a completely different set of open facilities in the portfolio. Inspired by this, we introduce the notion of refinement, which is a family of solutions, one for each p-norm, satisfying a combinatorial property. This property requires that (1) the set of facilities open for a higher p-norm must be a subset of the facilities open for a lower p-norm, and (2) all clients assigned to an open facility for a lower p-norm must be assigned to the same open facility for any higher p-norm. A refinement is α-approximate if the solution for each p-norm problem is an α-approximation for it. We show that it suffices to consider only O(log r) norms, instead of all p ∈ [1, ∞], to construct refinements. A natural greedy algorithm for the problem gives a poly(r)-approximate refinement, which we improve to poly(r^{1/\sqrt{log r}})-approximate using a recursive algorithm. We improve this ratio to O(log r) for the special case of tree metrics with uniform facility opening costs. Our recursive algorithm extends to other settings, including a hierarchical facility location problem that models facility location at several levels, such as public works departments and schools. A full version of this paper can be found at https://arxiv.org/abs/2211.14873.
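For concreteness, here is a minimal sketch of the group p-norm objective and of a geometric grid of norms in the spirit of the O(log r) reduction; the function names and data layout are hypothetical, and this is not the paper's algorithm for building portfolios or refinements.

```python
import numpy as np

def group_pnorm_cost(distances, group_of_client, n_groups, p):
    """Fairness objective sketched above: each group's cost is the average
    distance of its clients to their assigned open facilities, and the
    objective is the Minkowski p-norm of the vector of group costs
    (p = inf recovers the maximum group distance).
    distances[i] is the hypothetical distance of client i to its facility."""
    group_cost = np.zeros(n_groups)
    for g in range(n_groups):
        members = [d for d, grp in zip(distances, group_of_client) if grp == g]
        group_cost[g] = np.mean(members) if members else 0.0
    if np.isinf(p):
        return float(group_cost.max())
    return float(np.sum(group_cost ** p) ** (1.0 / p))

def norm_grid(n_groups):
    """Geometric grid of p values, mirroring the observation that O(log r)
    norms suffice to approximate every p in [1, inf]."""
    ps, p = [], 1.0
    while p < n_groups:
        ps.append(p)
        p *= 2.0
    return ps + [float("inf")]
```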
  2. Settings such as lending and policing can be modeled by a centralized agent allocating a scarce resource (e.g. loans or police officers) amongst several groups, in order to maximize some objective (e.g. loans given that are repaid, or criminals that are apprehended). Often in such problems fairness is also a concern. One natural notion of fairness, based on general principles of equality of opportunity, asks that conditional on an individual being a candidate for the resource in question, the probability of actually receiving it is approximately independent of the individual's group. For example, in lending this would mean that equally creditworthy individuals in different racial groups have roughly equal chances of receiving a loan. In policing it would mean that two individuals committing the same crime in different districts would have roughly equal chances of being arrested. In this paper, we formalize this general notion of fairness for allocation problems and investigate its algorithmic consequences. Our main technical results include an efficient learning algorithm that converges to an optimal fair allocation even when the allocator does not know the frequency of candidates (i.e. creditworthy individuals or criminals) in each group. This algorithm operates in a censored feedback model in which only the number of candidates who received the resource in a given allocation can be observed, rather than the true number of candidates in each group. This models the fact that we do not learn the creditworthiness of individuals we do not give loans to and do not learn about crimes committed if the police presence in a district is low.
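A toy sketch of the censored-feedback setting follows; the environment function `sample_candidates`, the reallocation step size, and the saturation rule are illustrative assumptions, not the paper's learning algorithm.

```python
import numpy as np

def censored_allocation(sample_candidates, total_units, n_groups, n_rounds=200):
    """Allocate a scarce resource under censored feedback: each round the
    allocator observes only how many allocated units were used, i.e.
    min(candidates, allocation), never the true number of candidates.
    sample_candidates() is a hypothetical environment returning that round's
    per-group candidate counts; the allocator never reads them directly.
    Groups whose allocation saturates receive more units next round."""
    alloc = np.full(n_groups, total_units / n_groups)
    for _ in range(n_rounds):
        candidates = sample_candidates()                   # hidden from allocator
        served = np.minimum(candidates, np.floor(alloc))   # censored observation
        saturated = served >= np.floor(alloc)
        if saturated.any() and (~saturated).any():
            # shift a small amount of resource toward saturated groups
            shift = 0.05 * total_units
            alloc[~saturated] -= shift / (~saturated).sum()
            alloc[saturated] += shift / saturated.sum()
            alloc = np.clip(alloc, 0, None)
            alloc *= total_units / alloc.sum()
    return alloc
```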
  3. Abstract

    Natural resources often exhibit large interannual fluctuations in productivity driven by shifting environmental conditions, and this translates to high variability in the revenue resource users earn. However, users can dampen this variability by harvesting a portfolio of resources. In the context of fisheries, this means targeting multiple populations, though the ability to actually build diverse fishing portfolios is often constrained by the costs and availability of fishing permits. These constraints are generally intended to prevent overcapitalization of the fleet and ensure populations are fished sustainably. As linked human-natural systems, both ecological and fishing dynamics influence the specific advantages and disadvantages of increasing the diversity of fishing portfolios. Specifically, a portfolio of synchronous populations with similar responses to environmental drivers should reduce revenue variability less than a portfolio of asynchronous populations with opposite responses. We built a bioeconomic model based on the Dungeness crab (Metacarcinus magister), Chinook salmon (Oncorhynchus tshawytscha), and groundfish fisheries in the California Current, and used it to explore the influence of population synchrony and permit access on income patterns. As expected, synchronous populations reduced revenue variability less than asynchronous populations, but only for portfolios including crab and salmon. Synchrony with the longer-lived groundfish population was not important because environmentally driven changes in groundfish recruitment were mediated by growth and natural mortality over the full population age structure, and overall biomass was relatively stable across years. Thus, building a portfolio of diverse life histories can buffer against the impacts of poor environmental conditions over short time scales. Increasing access to all permits generally led to increased revenue stability and decreased inequality of the fleet, but also resulted in less revenue earned by an individual from a given portfolio because more vessels shared the available biomass. This means managers are faced with a trade-off between the average revenue individuals earn and the risk those individuals accept. These results illustrate the importance of considering connections between social and ecological dynamics when evaluating management options that constrain or facilitate fishers' ability to diversify their fishing.
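A toy simulation (not the paper's bioeconomic model) makes the synchrony point concrete: revenue from an equal split across two correlated populations varies less when the populations are asynchronous (negatively correlated).

```python
import numpy as np

rng = np.random.default_rng(0)

def revenue_cv(rho, years=10000, mean=1.0, sd=0.3):
    """Simulate two correlated annual revenue streams and report the
    coefficient of variation of their equal-split portfolio; rho is the
    correlation between the two populations' revenues (all values are
    illustrative assumptions)."""
    cov = [[sd**2, rho * sd**2], [rho * sd**2, sd**2]]
    r = rng.multivariate_normal([mean, mean], cov, size=years)
    total = r.sum(axis=1) / 2.0
    return total.std() / total.mean()

print(revenue_cv(rho=0.8))    # synchronous populations: higher revenue variability
print(revenue_cv(rho=-0.8))   # asynchronous populations: lower revenue variability
```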

     
  4. We consider the problem of accurately recovering a matrix B of size M by M, which represents a probability distribution over M^2 outcomes, given access to an observed matrix of "counts" generated by taking independent samples from the distribution B. How can structural properties of the underlying matrix B be leveraged to yield computationally efficient and information-theoretically optimal reconstruction algorithms? When can accurate reconstruction be accomplished in the sparse data regime? This basic problem lies at the core of a number of questions that are currently being considered by different communities, including building recommendation systems and collaborative filtering in the sparse data regime, community detection in sparse random graphs, learning structured models such as topic models or hidden Markov models, and the efforts from the natural language processing community to compute "word embeddings". Many aspects of this problem---both in terms of learning and property testing/estimation, and on both the algorithmic and information-theoretic sides---remain open. Our results apply to the setting where B has a low-rank structure. For this setting, we propose an efficient (and practically viable) algorithm that accurately recovers the underlying M by M matrix using O(M) samples (where we assume the rank is a constant). This linear sample complexity is optimal, up to constant factors, in an extremely strong sense: even testing basic properties of the underlying matrix (such as whether it has rank 1 or 2) requires Omega(M) samples. Additionally, we provide an even stronger lower bound showing that distinguishing whether a sequence of observations was drawn from the uniform distribution over M observations, versus being generated by a well-conditioned hidden Markov model with two hidden states, requires Omega(M) observations, while our positive results for recovering B immediately imply that O(M) observations suffice to learn such an HMM. This lower bound precludes sublinear-sample hypothesis tests for basic properties, such as identity or uniformity, as well as sublinear-sample estimators for quantities such as the entropy rate of HMMs.
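As a point of reference, a minimal low-rank baseline is to normalize the count matrix and project it onto the nearest rank-k matrix with a truncated SVD; this is only a sketch of the general approach and not the paper's estimator, which is designed for the much sparser O(M)-sample regime.

```python
import numpy as np

def truncated_svd_estimate(counts, rank):
    """Baseline sketch: form the empirical distribution from an M x M count
    matrix, project it onto rank `rank` via a truncated SVD, then clip and
    renormalize so the result is again a probability matrix."""
    empirical = counts / counts.sum()
    u, s, vt = np.linalg.svd(empirical, full_matrices=False)
    est = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    est = np.clip(est, 0.0, None)   # probabilities are nonnegative
    return est / est.sum()          # renormalize to sum to 1
```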
  5. ABSTRACT

    The presented methodology results in an optimal portfolio of resilience-oriented resource allocation under weather-related risks. The pre-event mitigations improve the capacity of the transportation system to absorb shocks from future natural hazards, contributing to risk reduction. The post-event recovery planning enhances the system's ability to bounce back rapidly, promoting network resilience. Considering the complex nature of the problem, due to the uncertainty of hazards and the impact of pre-event decisions on post-event planning, this study formulates a nonlinear two-stage stochastic programming (NTSSP) model, with the objective of minimizing the direct construction investment and indirect costs in both the pre-event mitigation and post-event recovery stages. In the model, the first stage prioritizes a group of bridges to be retrofitted or repaired to improve the system's robustness and redundancy. The second stage elaborates the uncertain occurrence of a type of natural hazard with any potential intensity at any possible network location. The damaged state of the network depends on the decisions made in first-stage mitigation efforts. While there has been research addressing the optimization of pre-event or post-event efforts, the number of studies addressing both stages in the same framework is limited, and even those studies are limited in their application due to the consideration of small networks with a limited number of assets. The NTSSP model addresses this gap and builds a large-scale data-driven simulation environment. To effectively solve the NTSSP model, a hybrid heuristic method of evolution strategy with high-performance parallel computing is applied, through which the evolutionary process is accelerated and the computing time is reduced as a result. The NTSSP model is implemented in a test-bed transportation network in Iowa under flood hazards. The results show that the NTSSP model balances economy and efficiency in risk mitigation within the budgeted investment while consistently providing a resilient system throughout the full two-stage course.
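A toy two-stage skeleton (with hypothetical cost functions and scenario format; a stand-in for, not a reproduction of, the NTSSP formulation) shows the structure being optimized: first-stage retrofit costs plus the expectation of second-stage recovery costs over hazard scenarios.

```python
import itertools

def evaluate_plan(retrofit_set, scenarios, retrofit_cost, recovery_cost):
    """Two-stage objective sketch: first-stage cost of retrofitting the chosen
    bridges plus the expected second-stage recovery cost over sampled hazard
    scenarios; retrofitted bridges are assumed cheaper to restore.
    retrofit_cost and recovery_cost are hypothetical per-bridge cost functions;
    each scenario is a (probability, damaged_bridges) pair."""
    first_stage = sum(retrofit_cost(b) for b in retrofit_set)
    expected_recovery = 0.0
    for prob, damaged in scenarios:
        stage2 = sum(recovery_cost(b, retrofitted=(b in retrofit_set)) for b in damaged)
        expected_recovery += prob * stage2
    return first_stage + expected_recovery

def best_plan(bridges, budget_k, scenarios, retrofit_cost, recovery_cost):
    # Brute force over small retrofit sets; the paper instead applies an
    # evolution-strategy heuristic with parallel computing at realistic scale.
    candidates = itertools.combinations(bridges, budget_k)
    return min(candidates,
               key=lambda s: evaluate_plan(set(s), scenarios, retrofit_cost, recovery_cost))
```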

     