Assortment optimization has many important applications in both brick-and-mortar and online retailing. Decision makers select a subset of products to offer to customers from a universe of substitutable products, assuming that customers purchase according to a Markov chain choice model, a very general choice model that encompasses many popular models. The existing literature predominantly assumes that the customer arrival process and the Markov chain choice model parameters are given as input to the stochastic optimization model. In practice, however, decision makers may not have this information and must learn it while maximizing the total expected revenue on the fly. In “Online Learning for Constrained Assortment Optimization under the Markov Chain Choice Model,” S. Li, Q. Luo, Z. Huang, and C. Shi develop a series of efficient online learning algorithms for Markov chain choice-based assortment optimization problems, with provable performance guarantees.
A Statistical Learning Approach to Personalization in Revenue Management
We consider a logit model-based framework for modeling joint pricing and assortment decisions that take into account customer features. This model provides a significant advantage when one has insufficient data for any one customer and wishes to generalize learning about one customer’s preferences to the population. Under this model, we study the statistical learning task of model fitting from a static store of precollected customer data. This setting, in contrast to the popular learning and earning paradigm, represents the situation many business teams encounter in which their data collection abilities have outstripped their data analysis capabilities. In this learning setting, we establish finite-sample convergence guarantees on the model parameters. The parameter convergence guarantees are then extended to out-of-sample performance guarantees in terms of revenue, in the form of a high-probability bound on the gap between the expected revenue of the best action taken under the estimated parameters and the revenue generated by a decision maker with full knowledge of the choice model. We further discuss practical implications of these bounds. We demonstrate the personalization approach using ticket purchase data from an airline carrier. This paper was accepted by J. George Shanthikumar, special issue on data-driven prescriptive analytics.
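As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a feature-dependent logit (multinomial logit) choice model and the expected revenue of an offered assortment at given prices. The linear utility form, the no-purchase option with utility 0, and all names (`choice_probabilities`, `expected_revenue`, `betas`, `price_sensitivity`) are assumptions made for illustration; the abstract does not state the paper's exact specification.

```python
import math


def choice_probabilities(utilities):
    """Logit purchase probabilities for the offered products.

    Assumes a no-purchase option with utility 0 (an illustrative convention).
    """
    # Shift by the largest utility (including the outside option) for stability.
    m = max([0.0, *utilities])
    weights = [math.exp(u - m) for u in utilities]
    denom = math.exp(-m) + sum(weights)
    return [w / denom for w in weights]


def expected_revenue(features, prices, betas, price_sensitivity):
    """Expected revenue from one customer for an offered assortment.

    features: the customer's feature vector.
    prices: one price per offered product.
    betas: one coefficient vector per offered product (hypothetical parameters).
    price_sensitivity: assumed common coefficient on price.
    """
    utilities = [
        sum(b * x for b, x in zip(beta, features)) - price_sensitivity * p
        for beta, p in zip(betas, prices)
    ]
    probs = choice_probabilities(utilities)
    return sum(p * q for p, q in zip(prices, probs))
```

Under these assumptions, the revenue gap discussed in the abstract would compare `expected_revenue` evaluated at the true parameters for the decision chosen under estimated parameters against the decision chosen with full knowledge of the model.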
- Award ID(s): 1845444
- PAR ID: 10301867
- Date Published:
- Journal Name: Management Science
- ISSN: 0025-1909
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We develop a versatile new methodology for multidimensional mechanism design that incorporates side information about agent types to generate high social welfare and high revenue simultaneously. Prominent sources of side information in practice include predictions from a machine-learning model trained on historical agent data, advice from domain experts, and even the mechanism designer’s own gut instinct. In this paper we adopt a prior-free perspective that makes no assumptions on the correctness, accuracy, or source of the side information. First, we design a meta-mechanism that integrates input side information with an improvement of the classical VCG mechanism. The welfare, revenue, and incentive properties of our meta-mechanism are characterized by novel constructions we introduce based on the notion of a weakest competitor, which is an agent that has the smallest impact on welfare. We show that our meta-mechanism, when carefully instantiated, simultaneously achieves strong welfare and revenue guarantees parameterized by errors in the side information. When the side information is highly informative and accurate, our mechanism achieves welfare and revenue competitive with the total social surplus, and its performance decays continuously and gradually as the quality of the side information decreases. Finally, we apply our meta-mechanism to a setting where each agent’s type is determined by a constant number of parameters. Specifically, agent types lie on constant-dimensional subspaces (of the potentially high-dimensional ambient type space) that are known to the mechanism designer. We use our meta-mechanism to obtain the first known welfare and revenue guarantees in this setting.
- This paper presents an extension of Naor’s analysis on the join-or-balk problem in observable M/M/1 queues. Although all other Markovian assumptions still hold, we explore this problem assuming uncertain arrival rates under the distributionally robust settings. We first study the problem with the classical moment ambiguity set, where the support, mean, and mean-absolute deviation of the underlying distribution are known. Next, we extend the model to the data-driven setting, where decision makers only have access to a finite set of samples. We develop three optimal joining threshold strategies from the perspectives of an individual customer, a social optimizer, and a revenue maximizer such that their respective worst-case expected benefit rates are maximized. Finally, we compare our findings with Naor’s original results and the traditional sample average approximation scheme. Funding: This research was supported by the National Science Foundation [Grants 2342505 and 2343869].
- We consider dynamic assortment problems with reusable products, in which each arriving customer chooses a product within an offered assortment, uses the product for a random duration of time, and returns the product back to the firm to be used by other customers. The goal is to find a policy for deciding on the assortment to offer to each customer so that the total expected revenue over a finite selling horizon is maximized. The dynamic-programming formulation of this problem requires a high-dimensional state variable that keeps track of the on-hand product inventories, as well as the products that are currently in use. We present a tractable approach to compute a policy that is guaranteed to obtain at least 50% of the optimal total expected revenue. This policy is based on constructing linear approximations to the optimal value functions. When the usage duration is infinite or follows a negative binomial distribution, we also show how to efficiently perform rollout on a simple static policy. Performing rollout corresponds to using separable and nonlinear value function approximations. The resulting policy is also guaranteed to obtain at least 50% of the optimal total expected revenue. The special case of our model with infinite usage durations captures the case where the customers purchase the products outright without returning them at all. Under infinite usage durations, we give a variant of our rollout approach whose total expected revenue differs from the optimal by a factor that approaches 1 with rate cubic-root of Cmin, where Cmin is the smallest inventory of a product. We provide computational experiments based on simulated data for dynamic assortment management, as well as real parking transaction data for the city of Seattle. Our computational experiments demonstrate that the practical performance of our policies is substantially better than their performance guarantees and that performing rollout yields noticeable improvements.
- Online pricing has been the focus of extensive research in recent years, particularly in the context of selling an item to sequentially arriving users. However, what if a provider wants to maximize revenue by selling multiple items to multiple users in each round? This presents a complex problem, as the provider must intelligently offer the items to those users who value them the most without exceeding their highest acceptable prices. In this study, we tackle this challenge by designing online algorithms that can efficiently offer and price items while learning user valuations from accept/reject feedback. We focus on three user valuation models (fixed valuations, random experiences, and random valuations) and provide algorithms with nearly-optimal revenue regret guarantees. In particular, for any market setting with N users, M items, and load L (which roughly corresponds to the maximum number of simultaneous allocations possible), our algorithms achieve regret of order O(NM log log(LT)) under the fixed valuations model, O(√(NMLT)) under the random experiences model, and O(√(NMLT)) under the random valuations model in T rounds.

