Title: Interactive Learning with Pricing for Optimal and Stable Allocations in Markets
Large-scale online recommendation systems must facilitate the allocation of a limited number of items among competing users while learning their preferences from user feedback. As a principled way of incorporating market constraints and user incentives in the design, we consider our objectives to be two-fold: maximal social welfare with minimal instability. To maximize social welfare, our proposed framework enhances the quality of recommendations by exploring allocations that optimistically maximize the rewards. To minimize instability, a measure of users' incentives to deviate from recommended allocations, the algorithm prices the items based on a scheme derived from the Walrasian equilibria. Though it is known that these equilibria yield stable prices for markets with known user preferences, our approach accounts for the inherent uncertainty in the preferences and further ensures that the users accept their recommendations under offered prices. To the best of our knowledge, our approach is the first to integrate techniques from combinatorial bandits, optimal resource allocation, and collaborative filtering to obtain an algorithm that achieves sub-linear social welfare regret as well as sub-linear instability. Empirical studies on synthetic and real-world data also demonstrate the efficacy of our strategy compared to approaches that do not fully incorporate all these aspects.
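The two ingredients named in the abstract, optimistic allocation and Walrasian pricing, can be illustrated on a toy unit-demand market. The following is a minimal sketch under stated assumptions, not the paper's algorithm: each user receives at most one item, the optimism bonus is a plain per-pair UCB term (the paper instead draws on collaborative filtering), prices are the minimum Walrasian prices for known values (which, in unit-demand assignment markets, coincide with VCG payments), and instability is taken to be the total utility users could gain by deviating to their best option. All function names and numbers are hypothetical.

```python
import math
from itertools import permutations

def ucb_values(means, counts, t):
    """Optimistic value estimates: mean plus a standard UCB bonus (assumed form)."""
    return [[m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(mrow, crow)]
            for mrow, crow in zip(means, counts)]

def max_welfare(values, users, items):
    """Brute-force maximum-welfare assignment; each user gets at most one item."""
    users, items = list(users), list(items)
    slots = items + [None] * len(users)  # None means "no item"
    best_w, best_assign = float("-inf"), {}
    for perm in permutations(slots, len(users)):
        w = sum(values[u][j] for u, j in zip(users, perm) if j is not None)
        if w > best_w:
            best_w = w
            best_assign = {u: j for u, j in zip(users, perm) if j is not None}
    return best_w, best_assign

def min_walrasian_prices(values, items):
    """Minimum Walrasian prices for a unit-demand market with known values.

    The price of the item given to user u is the welfare the others could
    reach without u, minus the welfare they obtain when u is present.
    """
    users = range(len(values))
    total, assign = max_welfare(values, users, items)
    prices = {j: 0.0 for j in items}  # unallocated items are priced at zero
    for u, j in assign.items():
        others = [k for k in users if k != u]
        w_without, _ = max_welfare(values, others, items)
        prices[j] = w_without - (total - values[u][j])
    return assign, prices

def instability(values, assign, prices, items):
    """Total utility users could gain by switching to their best option (assumed measure)."""
    total = 0.0
    for u, row in enumerate(values):
        j = assign.get(u)
        got = row[j] - prices[j] if j is not None else 0.0
        best = max([0.0] + [row[k] - prices[k] for k in items])
        total += max(0.0, best - got)
    return total

# Hypothetical market: 3 users (rows), 2 items (columns).
values = [[0.9, 0.4],
          [0.8, 0.7],
          [0.3, 0.2]]
items = [0, 1]

assign, prices = min_walrasian_prices(values, items)
gap = instability(values, assign, prices, items)

# Optimistic estimates after each pair has been observed 5 times, at round t = 10.
optimistic = ucb_values(values, [[5, 5]] * 3, t=10)
```

With known values, the computed prices leave no user with an incentive to deviate, so `gap` is zero up to floating-point error. When the allocation and prices are instead computed from estimates such as `optimistic`, evaluating `instability` against the true values quantifies the deviation incentive that learning must drive down.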
Award ID(s):
2007669
NSF-PAR ID:
10471694
Author(s) / Creator(s):
; ;
Publisher / Repository:
PMLR
Date Published:
Journal Name:
Proceedings of the International Workshop on Artificial Intelligence and Statistics
ISSN:
1525-531X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the allocation of divisible goods to competing agents via a market mechanism, focusing on agents with Leontief utilities. The majority of the economics and mechanism design literature has focused on linear prices, meaning that the cost of a good is proportional to the quantity purchased. Equilibria for linear prices are known to be exactly the maximum Nash welfare allocations. Price curves allow the cost of a good to be any (increasing) function of the quantity purchased. We show that price curve equilibria are not limited to maximum Nash welfare allocations with two main results. First, we show that an allocation can be supported by strictly increasing price curves if and only if it is group-domination-free. A similar characterization holds for weakly increasing price curves. We use this to show that given any allocation, we can compute strictly (or weakly) increasing price curves that support it (or show that none exist) in polynomial time. These results involve a connection to the agent-order matrix of an allocation, which may have other applications. Second, we use duality to show that in the bandwidth allocation setting, any allocation maximizing a CES welfare function can be supported by price curves.
  2. We study the problem of approximating maximum Nash social welfare (NSW) when allocating m indivisible items among n asymmetric agents with submodular valuations. The NSW is a well-established notion of fairness and efficiency, defined as the weighted geometric mean of agents' valuations. For special cases of the problem with symmetric agents and additive(-like) valuation functions, approximation algorithms have been designed using approaches customized for these specific settings, and they fail to extend to more general settings. Hence, no approximation algorithm with a factor independent of m was known either for asymmetric agents with additive valuations or for symmetric agents beyond additive(-like) valuations before this work.

    In this article, we extend our understanding of the NSW problem to far more general settings. Our main contribution is two approximation algorithms for asymmetric agents with additive and submodular valuations. Both algorithms are simple to understand and involve non-trivial modifications of a greedy repeated matchings approach. Allocations of high-valued items are done separately by un-matching certain items and re-matching them by different processes in both algorithms. We show that these approaches achieve approximation factors of O(n) and O(n log n) for additive and submodular cases, independent of the number of items. For additive valuations, our algorithm outputs an allocation that also achieves the fairness property of envy-free up to one item (EF1).

    Furthermore, we show that the NSW problem under submodular valuations is strictly harder than all currently known settings with an \(\frac{\mathrm{e}}{\mathrm{e}-1}\) factor of the hardness of approximation, even for constantly many agents. For this case, we provide a different approximation algorithm that achieves a factor of \(\frac{\mathrm{e}}{\mathrm{e}-1}\), hence resolving it completely.
  3. We study the problem of approximating maximum Nash social welfare (NSW) when allocating m indivisible items among n asymmetric agents with submodular valuations. The NSW is a well-established notion of fairness and efficiency, defined as the weighted geometric mean of agents' valuations. For special cases of the problem with symmetric agents and additive(-like) valuation functions, approximation algorithms have been designed using approaches customized for these specific settings, and they fail to extend to more general settings. Hence, no approximation algorithm with a factor independent of m is known either for asymmetric agents with additive valuations or for symmetric agents beyond additive(-like) valuations. In this paper, we extend our understanding of the NSW problem to far more general settings. Our main contribution is two approximation algorithms for asymmetric agents with additive and submodular valuations, respectively. Both algorithms are simple to understand and involve non-trivial modifications of a greedy repeated matchings approach. Allocations of high-valued items are done separately by un-matching certain items and re-matching them, by processes that differ between the two algorithms. We show that these approaches achieve approximation factors of O(n) and O(n log n) for the additive and submodular cases, respectively, independent of the number of items. For additive valuations, our algorithm outputs an allocation that also achieves the fairness property of envy-free up to one item (EF1). Furthermore, we show that the NSW problem under submodular valuations is strictly harder than all currently known settings with an e/(e-1) factor of the hardness of approximation, even for constantly many agents. For this case, we provide a different approximation algorithm that achieves a factor of e/(e-1), hence resolving it completely.
  4. Online pricing has been the focus of extensive research in recent years, particularly in the context of selling an item to sequentially arriving users. However, what if a provider wants to maximize revenue by selling multiple items to multiple users in each round? This presents a complex problem, as the provider must intelligently offer the items to those users who value them the most without exceeding their highest acceptable prices. In this study, we tackle this challenge by designing online algorithms that can efficiently offer and price items while learning user valuations from accept/reject feedback. We focus on three user valuation models (fixed valuations, random experiences, and random valuations) and provide algorithms with nearly-optimal revenue regret guarantees. In particular, for any market setting with N users, M items, and load L (which roughly corresponds to the maximum number of simultaneous allocations possible), our algorithms achieve regret of order O(NM log log(LT)) under the fixed valuations model, O(√(NMLT)) under the random experiences model, and O(√(NMLT)) under the random valuations model in T rounds.
  5. We consider a demand management problem in an energy community, in which several users obtain energy from an external organization such as an energy company and pay for the energy according to pre-specified prices that consist of a time-dependent price per unit of energy as well as a separate price for peak demand. Since users' utilities are their private information, which they may not be willing to share, a mediator, known as the planner, is introduced to help optimize the overall satisfaction of the community (total utility minus total payments) by mechanism design. A mechanism consists of a message space, a tax/subsidy, and an allocation function for each user. Each user reports a message chosen from her own message space, then receives some amount of energy determined by the allocation function, and pays the tax specified by the tax function. A desirable mechanism induces a game whose Nash equilibria (NE) result in an allocation that coincides with the optimal allocation for the community. As a starting point, we design a mechanism for the energy community with desirable properties such as full implementation, strong budget balance, and individual rationality for both users and the planner. We then modify this baseline mechanism for communities where message exchanges are allowed only within neighborhoods, and consequently, the tax/subsidy and allocation functions of each user are determined only by the messages from their neighbors. All of the desirable properties of the baseline mechanism are preserved in the distributed mechanism. Finally, we present a learning algorithm for the baseline mechanism, based on projected gradient descent, that is guaranteed to converge to the NE of the induced game.