This content will become publicly available on July 11, 2025
- PAR ID:
- 10544885
- Publisher / Repository:
- EC'24
- Date Published:
- Format(s):
- Medium: X
- Location:
- New Haven, CT, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
null (Ed.)The prevalence of e-commerce has made customersâ detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When using personalized information, the question of how to protect the privacy of such information becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over T time periods with an unknown demand function of posted price and personalized information. At each time t, the retailer observes an arriving customerâs personal information and offers a price. The customer then makes the purchase decision, which will be utilized by the retailer to learn the underlying demand function. There is potentially a serious privacy concern during this process: a third-party agent might infer the personalized information and purchase decisions from price changes in the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer revenue while avoiding information leakage of individual customerâs information and purchasing decisions. To this end, we first introduce a notion of anticipating [Formula: see text]-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. Roughly speaking, for d-dimensional personalized information, our algorithm achieves the expected regret at the order of [Formula: see text] when the customersâ information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to [Formula: see text]. This paper was accepted by J. George Shanthikumar, big data analytics.more » « less
-
null (Ed.)We study the dynamic assortment planning problem, where for each arriving customer, the seller offers an assortment of substitutable products and the customer makes the purchase among offered products according to an uncapacitated multinomial logit (MNL) model. Because all the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customersâ choice behavior and make dynamic decisions on assortments based on the current knowledge. The goal of the seller is to maximize the expected revenue, or, equivalently, to minimize the expected regret. Although dynamic assortment planning problem has received an increasing attention in revenue management, most existing policies require the estimation of mean utility for each product and the final regret usually involves the number of products [Formula: see text]. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice modelâthe MNL modelâis still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence bound construction, which achieves an item-independent regret bound of [Formula: see text], where [Formula: see text] is the length of selling horizon. We further establish the matching lower bound result to show the optimality of our policy. There are two major advantages of the proposed policy. First, the regret of all our policies has no dependence on [Formula: see text]. Second, our policies are almost assumption-free: there is no assumption on mean utility nor any âseparabilityâ condition on the expected revenues for different assortments. We also extend our trisection search algorithm to capacitated MNL models and obtain the optimal regret [Formula: see text] (up to logrithmic factors) without any assumption on the mean utility parameters of items.more » « less
-
Abstract We study optimal pricing for tandem queueing systems with finite buffers. The service provider dynamically quotes prices to incoming price sensitive customers to maximize the longârun average revenue. We present a Markov decision process model for the optimization problem. For systems with two stations, generalâsized buffers, and two or more prices, we describe the structure of the optimal dynamic pricing policy and develop tailored policy iteration algorithms to find an optimal pricing policy. For systems with two stations but no intermediate buffer, we characterize conditions under which quoting either a high or a low price to all customers is optimal and provide an easyâtoâimplement algorithm to solve the problem. Numerical experiments are conducted to compare the developed algorithms with the regular policy iteration algorithm. The work also discusses possible extensions of the obtained results to both threeâstation systems and twoâstation systems with price and congestion sensitive customers using numerical analysis.
-
We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of T days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over T days that is at most O( p T), relative to an oracle pricing policy that knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense.more » « less
-
Contextual dynamic pricing aims to set personalized prices based on sequential interactions with customers. At each time period, a customer who is interested in purchasing a product comes to the platform. The customerâs valuation for the product is a linear function of contexts, including product and customer features, plus some random market noise. The seller does not observe the customerâs true valuation, but instead needs to learn the valuation by leveraging contextual information and historic binary purchase feedback. Existing models typically assume full or partial knowledge of the random noise distribution. In this paper, we consider contextual dynamic pricing with unknown random noise in the linear valuation model. Our distribution-free pricing policy learns both the contextual function and the market noise simultaneously. A key ingredient of our method is a novel perturbed linear bandit framework, in which a modified linear upper confidence bound algorithm is proposed to balance the exploration of market noise and the exploitation of the current knowledge for better pricing. We establish the regret upper bound and a matching lower bound of our policy in the perturbed linear bandit framework and prove a sublinear regret bound in the considered pricing problem. Finally, we demonstrate the superior performance of our policy on simulations and a real-life auto loan data set. Funding: Y. Liu and W.W. Sun acknowledge support from the National Science Foundation Division of Social and Economic Sciences [Grant NSF-SES 2217440]. Supplemental Material: The supplementary material is available at https://doi.org/10.1287/moor.2023.1369 .more » « less