Title: Risk-Sensitive Learning and Pricing for Demand Response
We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of T days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over T days that is at most O(√T), relative to an oracle pricing policy that knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense.
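The exploration-exploitation tradeoff described in the abstract can be illustrated with a minimal sketch. It is not the policy proposed in the paper: it assumes a risk-neutral payoff of (lam - p) * D, where lam is a hypothetical per-unit value of a demand reduction, D = a + b*p + shock is the affine demand curve, and the utility re-estimates (a, b) by least squares each day while adding a small, decaying price perturbation to keep learning. All names and parameter values (oracle_price, simulate, lam, p_max, the t^(-1/4) perturbation) are illustrative assumptions.

```python
import numpy as np


def oracle_price(a, b, lam):
    """Price maximizing the assumed expected payoff (lam - p) * (a + b * p), b > 0."""
    return lam / 2.0 - a / (2.0 * b)


def simulate(T=2000, a=1.0, b=2.0, lam=5.0, noise_sd=0.5, p_max=5.0, seed=0):
    """Run a certainty-equivalent pricing rule with decaying price perturbations.

    The true parameters (a, b) are used only to generate the observed
    reductions; the pricing rule itself sees only past prices and reductions.
    """
    rng = np.random.default_rng(seed)
    prices = np.empty(T)
    reductions = np.empty(T)
    for t in range(T):
        if t < 2:
            # Two distinct initial prices so the least-squares fit is identifiable.
            p = (1.0, 2.0)[t]
        else:
            # Least-squares estimate of the affine demand curve from past data.
            X = np.column_stack([np.ones(t), prices[:t]])
            a_hat, b_hat = np.linalg.lstsq(X, reductions[:t], rcond=None)[0]
            b_hat = max(b_hat, 1e-3)  # keep the estimated slope positive
            # Exploit: price as if the estimates were the truth.
            p_ce = oracle_price(a_hat, b_hat, lam)
            # Explore: alternate a perturbation that shrinks like t^(-1/4).
            sign = 1.0 if t % 2 == 0 else -1.0
            p = p_ce + sign * (t + 1) ** -0.25
        p = float(np.clip(p, 0.0, p_max))
        prices[t] = p
        reductions[t] = a + b * p + rng.normal(0.0, noise_sd)
    return prices


if __name__ == "__main__":
    prices = simulate()
    print("oracle price:", oracle_price(1.0, 2.0, 5.0))
    print("mean of last 100 offered prices:", prices[-100:].mean())
```

Under these assumptions the offered prices should settle near the oracle price as the estimates improve. Capturing risk sensitivity would require replacing the expected payoff with a risk measure, which this sketch does not attempt.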
Award ID(s):
1632124
PAR ID:
10043033
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE Transactions on Smart Grid
ISSN:
1949-3053
Page Range / eLocation ID:
1 to 1
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of T days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over T days that is at most O(√T), relative to an oracle who knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense. 
  2. The prevalence of e-commerce has made customers’ detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When using personalized information, the question of how to protect the privacy of such information becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over T time periods with an unknown demand function of posted price and personalized information. At each time t, the retailer observes an arriving customer’s personal information and offers a price. The customer then makes the purchase decision, which will be utilized by the retailer to learn the underlying demand function. There is potentially a serious privacy concern during this process: a third-party agent might infer the personalized information and purchase decisions from price changes in the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer revenue while avoiding information leakage of individual customer’s information and purchasing decisions. To this end, we first introduce a notion of anticipating [Formula: see text]-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. Roughly speaking, for d-dimensional personalized information, our algorithm achieves the expected regret at the order of [Formula: see text] when the customers’ information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to [Formula: see text]. This paper was accepted by J. George Shanthikumar, big data analytics.
  3. We consider a general non-stochastic online pricing bandit setting in a procurement scenario where a buyer with a budget wants to procure items from a fixed set of sellers to maximize the buyer's reward by dynamically offering purchasing prices to the sellers, where the sellers' costs and values at each time period can change arbitrarily and the sellers determine whether to accept the offered prices to sell the items. This setting models online pricing scenarios of procuring resources or services in multi-agent systems. We first consider the offline setting when sellers' costs and values are known in advance and investigate the best fixed-price policy in hindsight. We show that it has a tight approximation guarantee with respect to the offline optimal solutions. In the general online setting, we propose an online pricing policy, Granularity-based Pricing (GAP), which exploits underlying side-information from the feedback graph when the budget is given as the input. We show that GAP achieves an upper bound of O(n (v_max/c_min) √(B/c_min) ln B) on the α-regret, where n, v_max, c_min, and B are the number, the maximum value, the minimum cost of sellers, and the budget, respectively. We then extend it to the unknown budget case by developing a variant of GAP, namely Doubling-GAP, and show its α-regret is at most O(n (v_max/c_min) √(B/c_min) ln² B). We also provide an α-regret lower bound Ω(v_max √(Bn/c_min)) of any online policy that is tight up to sub-linear terms. We conduct simulation experiments to show that the proposed policy outperforms the baseline algorithms.
  4. Problem definition: We study a feature-based pricing problem with demand censoring in an offline, data-driven setting. In this problem, a firm is endowed with a finite amount of inventory and faces a random demand that is dependent on the offered price and the features (from products, customers, or both). Any unsatisfied demand that exceeds the inventory level is lost and unobservable. The firm does not know the demand function but has access to an offline data set consisting of quadruplets of historical features, inventory, price, and potentially censored sales quantity. Our objective is to use the offline data set to find the optimal feature-based pricing rule so as to maximize the expected profit. Methodology/results: Through the lens of causal inference, we propose a novel data-driven algorithm that is motivated by survival analysis and doubly robust estimation. We derive a finite sample regret bound to justify the proposed offline learning algorithm and prove its robustness. Numerical experiments demonstrate the robust performance of our proposed algorithm in accurately estimating optimal prices on both training and testing data. Managerial implications: The work provides practitioners with an innovative modeling and algorithmic framework for the feature-based pricing problem with demand censoring through the lens of causal inference. Our numerical experiments underscore the value of considering demand censoring in the context of feature-based pricing. Funding: The research of E. Fang is partially supported by the National Science Foundation [Grants NSF DMS-2346292, NSF DMS-2434666] and the Whitehead Scholarship. The research of C. Shi is partially supported by the Amazon Research Award. Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2024.1061 . 
  5. In online sales, sellers usually offer each potential buyer a posted price in a take-it-or-leave-it fashion. Buyers can sometimes see posted prices faced by other buyers, and changing the price frequently could be considered unfair. The literature on posted-price mechanisms and prophet inequality problems has studied the two extremes of pricing policies, the fixed-price policy and fully dynamic pricing. The former is suboptimal in revenue but is perceived as fairer than the latter. This work examines the middle situation, where there are at most k distinct prices over the selling horizon. Using the framework of prophet inequalities with independent and identically distributed random variables, we propose a new prophet inequality for strategies that use at most k thresholds. We present asymptotic results in k and results for small values of k. For k = 2 prices, we show an improvement of at least 11% over the best fixed-price solution. Moreover, k = 5 prices suffice to guarantee almost 99% of the approximation factor obtained by a fully dynamic policy that uses an arbitrary number of prices. From a technical standpoint, we use an infinite-dimensional linear program in our analysis; this formulation could be of independent interest to other online selection problems.