skip to main content

Search for: All records

Award ID contains: 2046640

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available December 29, 2024
  2. Function approximation (FA) has been a critical component in solving large zero-sum games. Yet, little attention has been given towards FA in solving general-sum extensive-form games, despite them being widely regarded as being computationally more challenging than their fully competitive or cooperative counterparts. A key challenge is that for many equilibria in general-sum games, no simple analogue to the state value function used in Markov Decision Processes and zero-sum games exists. In this paper, we propose learning the Enforceable Payoff Frontier (EPF)---a generalization of the state value function for general-sum games. We approximate the optimal Stackelberg extensive-form correlated equilibrium by representing EPFs with neural networks and training them by using appropriate backup operations and loss functions. This is the first method that applies FA to the Stackelberg setting, allowing us to scale to much larger games while still enjoying performance guarantees based on FA error. Additionally, our proposed method guarantees incentive compatibility and is easy to evaluate without having to depend on self-play or approximate best-response oracles.

    more » « less
  3. Non-governmental organizations for environmental conservation have a significant interest in monitoring conservation-related media and getting timely updates about infrastructure construction projects as they may cause massive impact to key conservation areas. Such monitoring, however, is difficult and time-consuming. We introduce NewsPanda, a toolkit which automatically detects and analyzes online articles related to environmental conservation and infrastructure construction. We fine-tune a BERT-based model using active learning methods and noise correction algorithms to identify articles that are relevant to conservation and infrastructure construction. For the identified articles, we perform further analysis, extracting keywords and finding potentially related sources. NewsPanda has been successfully deployed by the World Wide Fund for Nature teams in the UK, India, and Nepal since February 2022. It currently monitors over 80,000 websites and 1,074 conservation sites across India and Nepal, saving more than 30 hours of human efforts weekly. We have now scaled it up to cover 60,000 conservation sites globally.

    more » « less
  4. Correlated Equilibrium is a solution concept that is more general than Nash Equilibrium (NE) and can lead to outcomes with better social welfare. However, its natural extension to the sequential setting, the Extensive Form Correlated Equilibrium (EFCE), requires a quadratic amount of space to solve, even in restricted settings without randomness in nature. To alleviate these concerns, we apply subgame resolving, a technique extremely successful in finding NE in zero-sum games to solving general-sum EFCEs. Subgame resolving refines a correlation plan in an online manner: instead of solving for the full game upfront, it only solves for strategies in subgames that are reached in actual play, resulting in significant computational gains. In this paper, we (i) lay out the foundations to quantify the quality of a refined strategy, in terms of the social welfare and exploitability of correlation plans, (ii) show that EFCEs possess a sufficient amount of independence between subgames to perform resolving efficiently, and (iii) provide two algorithms for resolving, one using linear programming and the other based on regret minimization. Both methods guarantee safety, i.e., they will never be counterproductive. Our methods are the first time an online method has been applied to the correlated, general-sum setting. 
    more » « less
  5. Food waste and insecurity are two societal challenges that coexist in many parts of the world. A prominent force to combat these issues, food rescue platforms match food donations to organizations that serve underprivileged communities, and then rely on external volunteers to transport the food. Previous work has developed machine learning models for food rescue volunteer engagement. However, having long worked with domain practitioners to deploy AI tools to help with food rescues, we understand that there are four main pain points that keep such a machine learning model from being actually useful in practice: small data, data collected only under the default intervention, unmodeled objectives due to communication gap, and unforeseen consequences of the intervention. In this paper, we introduce bandit data-driven optimization which not only helps address these pain points in food rescue, but also is applicable to other nonprofit domains that share similar challenges. Bandit data-driven optimization combines the advantages of online bandit learning and offline predictive analytics in an integrated framework. We propose PROOF, a novel algorithm for this framework and formally prove that it has no-regret. We show that PROOF performs better than existing baseline on food rescue volunteer recommendation. 
    more » « less