Title: Revenue maximization in two‐station tandem queueing systems

We study optimal pricing for tandem queueing systems with finite buffers. The service provider dynamically quotes prices to incoming price sensitive customers to maximize the long‐run average revenue. We present a Markov decision process model for the optimization problem. For systems with two stations, general‐sized buffers, and two or more prices, we describe the structure of the optimal dynamic pricing policy and develop tailored policy iteration algorithms to find an optimal pricing policy. For systems with two stations but no intermediate buffer, we characterize conditions under which quoting either a high or a low price to all customers is optimal and provide an easy‐to‐implement algorithm to solve the problem. Numerical experiments are conducted to compare the developed algorithms with the regular policy iteration algorithm. The work also discusses possible extensions of the obtained results to both three‐station systems and two‐station systems with price and congestion sensitive customers using numerical analysis.
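The Markov decision process described above can be illustrated with a small numerical sketch. This is not the tailored policy iteration from the paper: it is plain relative value iteration on a uniformized chain, under assumptions the abstract does not state. Arrivals are assumed Poisson with a price-dependent rate, services exponential, station 1 is assumed unable to complete a service while station 2 is full, and no price is quoted while station 1 is full. The function name and all parameter names are hypothetical.

```python
from itertools import product


def optimal_average_revenue(prices, rates, mu1, mu2, cap1, cap2,
                            tol=1e-10, max_iter=100_000):
    """Relative value iteration for a two-station tandem queue (illustrative).

    prices[a] is a quoted price and rates[a] the assumed arrival rate of
    customers who accept it. cap1 and cap2 are the maximum numbers of
    customers at stations 1 and 2, including the one in service.
    Returns (long-run average revenue per unit time, price quoted per state).
    """
    gamma = max(rates) + mu1 + mu2          # uniformization rate
    states = list(product(range(cap1 + 1), range(cap2 + 1)))
    h = {s: 0.0 for s in states}            # relative value function
    for _ in range(max_iter):
        new_h, policy = {}, {}
        for i, j in states:
            value = 0.0                     # rate-weighted continuation value
            stay = gamma                    # rate left over for self-loops
            if i > 0 and j < cap2:          # station 1 finishes (assumed blocking rule)
                value += mu1 * h[(i - 1, j + 1)]
                stay -= mu1
            if j > 0:                       # station 2 finishes, customer leaves
                value += mu2 * h[(i, j - 1)]
                stay -= mu2
            if i < cap1:                    # room at station 1: choose a price
                best_q, best_p = max(
                    (lam * (p + h[(i + 1, j)]) + (stay - lam) * h[(i, j)], p)
                    for p, lam in zip(prices, rates))
                policy[(i, j)] = best_p
            else:                           # station 1 full: arrivals are lost
                best_q = stay * h[(i, j)]
                policy[(i, j)] = None
            new_h[(i, j)] = (value + best_q) / gamma
        diffs = [new_h[s] - h[s] for s in states]
        ref = new_h[(0, 0)]
        h = {s: new_h[s] - ref for s in states}   # keep values bounded
        if max(diffs) - min(diffs) < tol:         # span stopping criterion
            return gamma * 0.5 * (max(diffs) + min(diffs)), policy
    raise RuntimeError("relative value iteration did not converge")


if __name__ == "__main__":
    gain, policy = optimal_average_revenue(
        prices=[1.0, 2.0], rates=[1.0, 0.4], mu1=1.0, mu2=1.5, cap1=3, cap2=2)
    print(round(gain, 4))
    for state in sorted(policy):
        print(state, policy[state])
```

With a single price the routine reduces to computing the revenue rate of a fixed policy, which gives a simple sanity check: the dynamic policy over two prices should earn at least as much as either price quoted to everyone.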

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Naval Research Logistics (NRL)
Page Range / eLocation ID:
p. 77-107
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We consider a dynamic pricing problem where customer response to the current price is impacted by the customer price expectation, also known as the reference price. We study a simple and novel reference price mechanism where the reference price is the average of the past prices offered by the seller. As opposed to the more commonly studied exponential smoothing mechanism, in our reference price mechanism the prices offered by the seller have a longer-term effect on future customer expectations. We show that under this mechanism, a markdown policy is near-optimal irrespective of the parameters of the model. This matches the common intuition that a seller may be better off starting with a higher price and then decreasing it, as customers feel like they are getting bargains on items that are ordinarily more expensive. For linear demand models, we also provide a detailed characterization of the near-optimal markdown policy along with an efficient way of computing it. We then consider a more challenging dynamic pricing and learning problem, where the demand model parameters are a priori unknown, and the seller needs to learn them online from the customers’ responses to the offered prices while simultaneously optimizing revenue. The objective is to minimize regret, i.e., the 𝑇-round revenue loss compared to a clairvoyant optimal policy. This task essentially amounts to learning a non-stationary optimal policy in a time-variant Markov Decision Process (MDP). For linear demand models, we provide an efficient learning algorithm with an optimal 𝑂(√𝑇) regret upper bound.
  2. In the context of subscription-based services, many technologies improve over time, and service providers can provide increasingly powerful service upgrades to their customers but at a launching cost and the expense of the sales of existing products. We propose a model of technology upgrades and characterize the optimal pricing and timing of technology introductions for a service provider who price-discriminates among customers based on their upgrade experience in the face of customers who are averse to switching to improved offerings. We first characterize optimal discriminatory pricing for the infinite horizon pricing problem with fixed introduction times. We reduce the optimal pricing problem to a tractable optimization problem and propose an efficient algorithm for solving it. Our algorithm computes optimal discriminatory prices within a fraction of a second even for large problem instances. We then show that periodic introduction times, combined with optimal pricing, enjoy optimality guarantees. In particular, we first show that, as long as the introduction intervals are constrained to be nonincreasing, it is optimal to have periodic introductions after an initial warm-up phase. When allowing general introduction intervals, we show that periodic introduction intervals after some time are optimal in a more restricted sense. Numerical experiments suggest that it is generally optimal to have periodic introductions after an initial warm-up phase. Finally, we focus on a setting in which the firm does not price-discriminate based on customers’ experience. We show both analytically and numerically that in the nondiscriminatory setting, a simple policy of Myerson (i.e., myopic) pricing and periodic introductions enjoys good performance guarantees. Funding: This material is based upon work supported by INSEAD and University Pierre et Marie Curie [Grant ELICIT], as well as by the National Science Foundation [Grant 2110707]. 
Supplemental Material: The online appendix is available at . 
  3. We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of T days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over T days that is at most O(√T), relative to an oracle pricing policy that knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense.
  4. We consider the setting in which an electric power utility seeks to curtail its peak electricity demand by offering a fixed group of customers a uniform price for reductions in consumption relative to their predetermined baselines. The underlying demand curve, which describes the aggregate reduction in consumption in response to the offered price, is assumed to be affine and subject to unobservable random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the utility, we investigate the extent to which the utility might dynamically adjust its offered prices to maximize its cumulative risk-sensitive payoff over a finite number of T days. In order to do so effectively, the utility must design its pricing policy to balance the tradeoff between the need to learn the unknown demand model (exploration) and maximize its payoff (exploitation) over time. In this paper, we propose such a pricing policy, which is shown to exhibit an expected payoff loss over T days that is at most O(√T), relative to an oracle who knows the underlying demand model. Moreover, the proposed pricing policy is shown to yield a sequence of prices that converge to the oracle optimal prices in the mean square sense. 
  5. We consider a fundamental pricing model in which a fixed number of units of a reusable resource are used to serve customers. Customers arrive to the system according to a stochastic process and, upon arrival, decide whether to purchase the service, depending on their willingness to pay and the current price. The service time during which the resource is used by the customer is stochastic, and the firm may incur a service cost. This model represents various markets for reusable resources, such as cloud computing, shared vehicles, rotable parts, and hotel rooms. In the present paper, we analyze this pricing problem when the firm attempts to maximize a weighted combination of three central metrics: profit, market share, and service level. Under Poisson arrivals, exponential service times, and standard assumptions on the willingness-to-pay distribution, we establish a series of results that characterize the performance of static pricing in such environments. In particular, although an optimal policy is fully dynamic in such a context, we prove that a static pricing policy simultaneously guarantees 78.9% of the profit, market share, and service level from the optimal policy. Notably, this result holds for any service rate and number of units the firm operates. Our proof technique relies on a judicious construction of a static price that is derived directly from the optimal dynamic pricing policy. In the special case in which there are two units and the induced demand is linear, we also prove that the static policy guarantees 95.5% of the profit from the optimal policy. Our numerical findings on a large test bed of instances suggest that the latter result is quite indicative of the profit obtained by the static pricing policy across all parameters. 