Efficient and truthful mechanisms to price time on remote servers/machines have been the subject of much work in recent years due to the importance of the cloud market. This paper considers online revenue maximization for a unit-capacity server, when jobs are non-preemptive, in the Bayesian setting: at each time step, one job arrives, with parameters drawn from an underlying distribution. We design an efficiently computable truthful posted-price mechanism, which maximizes revenue in expectation and in retrospect, up to additive error. The prices are posted prior to learning the agent's type, and the computed pricing scheme is deterministic. We also show the pricing mechanism is robust to learning the job distribution from samples, where polynomially many samples suffice to obtain near-optimal prices.
- Award ID(s):
- 1750436
- PAR ID:
- 10223187
- Date Published:
- Journal Name:
- Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20)
- Page Range / eLocation ID:
- 4106 to 4112
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Efficient and truthful mechanisms to price resources on servers/machines have been the subject of much work in recent years due to the importance of the cloud market. This paper considers revenue maximization in the online stochastic setting with non-preemptive jobs and a unit-capacity server. One agent/job arrives at every time step, with parameters drawn from the underlying distribution. We design a posted-price mechanism which can be efficiently computed and is revenue-optimal in expectation and in retrospect, up to additive error. The prices are posted prior to learning the agent’s type, and the computed pricing scheme is deterministic, depending only on the length of the allotted time interval and on the earliest time the server is available. We also prove that the proposed pricing strategy is robust to imprecise knowledge of the job distribution and that a distribution learned from polynomially many samples is sufficient to obtain a near-optimal truthful pricing strategy.
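As a rough illustration of the setting only, the following Python sketch simulates a fixed posted-price menu on a unit-capacity, non-preemptive server. It is not the paper's mechanism: the function name `simulate_posted_prices`, the length-indexed price menu, and the rule that a job arriving while the server is busy is simply turned away are all simplifying assumptions made here. In particular, the sketch ignores the dependence of prices on the earliest time the server is available.

```python
from typing import Dict, List, Tuple


def simulate_posted_prices(
    jobs: List[Tuple[int, float]],
    price_per_length: Dict[int, float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Simulate a hypothetical posted-price menu on a unit-capacity server.

    jobs[t] = (length, value) is the job arriving at time step t.
    price_per_length maps a job length to its posted price (assumed menu).
    Returns the total revenue and the list of served intervals [start, end).
    """
    free_at = 0          # earliest time the server is available
    revenue = 0.0
    schedule: List[Tuple[int, int]] = []
    for t, (length, value) in enumerate(jobs):
        if free_at > t:
            # Non-preemptive: a running job cannot be interrupted, so in this
            # simplified sketch the arriving job is turned away.
            continue
        price = price_per_length[length]
        if value >= price:
            # The price was posted before the job's type was seen; the job
            # accepts only if its value covers the price.
            revenue += price
            schedule.append((t, t + length))
            free_at = t + length
    return revenue, schedule


if __name__ == "__main__":
    jobs = [(2, 5.0), (1, 10.0), (1, 1.0), (1, 3.0)]
    menu = {1: 2.0, 2: 4.0}
    print(simulate_posted_prices(jobs, menu))
```

In the example, the high-value job arriving at step 1 is lost because the server is occupied by the earlier length-2 job, which is the kind of trade-off a revenue-maximizing price menu has to account for.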
-
We consider a dynamic pricing problem where customer response to the current price is impacted by the customer's price expectation, also known as the reference price. We study a simple and novel reference price mechanism where the reference price is the average of the past prices offered by the seller. As opposed to the more commonly studied exponential smoothing mechanism, in our reference price mechanism the prices offered by the seller have a longer-term effect on future customer expectations. We show that under this mechanism, a markdown policy is near-optimal irrespective of the parameters of the model. This matches the common intuition that a seller may be better off starting with a higher price and then decreasing it, as customers feel like they are getting bargains on items that are ordinarily more expensive. For linear demand models, we also provide a detailed characterization of the near-optimal markdown policy along with an efficient way of computing it. We then consider a more challenging dynamic pricing and learning problem, where the demand model parameters are a priori unknown, and the seller needs to learn them online from the customers’ responses to the offered prices while simultaneously optimizing revenue. The objective is to minimize regret, i.e., the T-round revenue loss compared to a clairvoyant optimal policy. This task essentially amounts to learning a non-stationary optimal policy in a time-variant Markov Decision Process (MDP). For linear demand models, we provide an efficient learning algorithm with an optimal O(√T) regret upper bound.
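To make the contrast between the two reference price mechanisms concrete, here is a minimal Python sketch. The function names, the use of an initial reference price before any price has been offered, and the smoothing convention r_{t+1} = alpha * r_t + (1 - alpha) * p_t are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List


def average_reference_prices(prices: List[float], initial_reference: float) -> List[float]:
    """Reference price at step t = plain average of prices offered before t.

    Before any price has been offered, the (assumed) initial reference is used.
    """
    refs: List[float] = []
    total = 0.0
    for t, price in enumerate(prices):
        refs.append(initial_reference if t == 0 else total / t)
        total += price
    return refs


def smoothed_reference_prices(
    prices: List[float], initial_reference: float, alpha: float
) -> List[float]:
    """Exponential smoothing: r_{t+1} = alpha * r_t + (1 - alpha) * p_t (assumed form)."""
    refs: List[float] = []
    ref = initial_reference
    for price in prices:
        refs.append(ref)
        ref = alpha * ref + (1.0 - alpha) * price
    return refs


if __name__ == "__main__":
    markdown = [10.0, 6.0, 2.0, 2.0]  # a simple markdown sequence
    print(average_reference_prices(markdown, 10.0))
    print(smoothed_reference_prices(markdown, 10.0, alpha=0.5))
```

Under averaging, every past price keeps equal weight, so an early high price continues to lift the reference price; under exponential smoothing its weight decays geometrically.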
-
Sequential posted pricing auctions are popular because of their simplicity in practice and their tractability in theory. A usual assumption in their study is that the Bayesian prior distributions of the buyers are known to the seller, while in reality these priors can only be accessed from historical data. To overcome this assumption, we study sequential posted pricing in the bandit learning model, where the seller interacts with n buyers over T rounds: In each round the seller posts n prices for the n buyers and the first buyer with a valuation higher than the price takes the item. The only feedback that the seller receives in each round is the revenue. Our main results obtain nearly-optimal regret bounds for single-item sequential posted pricing in the bandit learning model. In particular, we achieve an Õ(poly(n) T^{1/2}) regret for buyers with (Myerson's) regular distributions and an Õ(poly(n) T^{2/3}) regret for buyers with general distributions, both of which are tight in the number of rounds T. Our result for regular distributions was previously not known even for the single-buyer setting and relies on a new half-concavity property of the revenue function in the value space. For n sequential buyers, our technique is to run a generalized single-buyer algorithm for all the buyers and to carefully bound the regret from the sub-optimal pricing of the suffix buyers.
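The interaction in a single round can be sketched in Python as follows. This only illustrates the round structure described above, not the learning algorithm; the function name `run_round` and the convention of returning zero revenue when no buyer accepts are assumptions made here.

```python
from typing import List, Optional, Tuple


def run_round(prices: List[float], valuations: List[float]) -> Tuple[Optional[int], float]:
    """Play one round of single-item sequential posted pricing.

    Buyers are approached in order; the first buyer whose valuation is
    strictly higher than their posted price takes the item and pays it.
    Returns (index of the winning buyer or None, revenue).
    """
    for i, (price, valuation) in enumerate(zip(prices, valuations)):
        if valuation > price:
            return i, price
    return None, 0.0


if __name__ == "__main__":
    winner, revenue = run_round([5.0, 3.0, 4.0], [4.0, 3.5, 10.0])
    # In the bandit model the seller would observe only `revenue`,
    # not the valuations and not necessarily which buyer accepted.
    print(winner, revenue)
```

Note that the third buyer, who values the item most, never gets to buy because an earlier buyer accepted; prices for later ("suffix") buyers matter only in rounds where all earlier buyers decline.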
-
Motivated by demand-responsive parking pricing systems, we consider posted-price algorithms for the online metric matching problem. We give an O(log n)-competitive posted-price randomized algorithm in the case that the metric space is a line. In particular, in this setting we show how to implement the ubiquitous guess-and-double technique using prices.
-
We study optimal pricing for tandem queueing systems with finite buffers. The service provider dynamically quotes prices to incoming price-sensitive customers to maximize the long-run average revenue. We present a Markov decision process model for the optimization problem. For systems with two stations, general-sized buffers, and two or more prices, we describe the structure of the optimal dynamic pricing policy and develop tailored policy iteration algorithms to find an optimal pricing policy. For systems with two stations but no intermediate buffer, we characterize conditions under which quoting either a high or a low price to all customers is optimal and provide an easy-to-implement algorithm to solve the problem. Numerical experiments are conducted to compare the developed algorithms with the regular policy iteration algorithm. The work also discusses possible extensions of the obtained results to both three-station systems and two-station systems with price- and congestion-sensitive customers using numerical analysis.