skip to main content


Title: Optimal Policy for Dynamic Assortment Planning Under Multinomial Logit Models
We study the dynamic assortment planning problem, where for each arriving customer, the seller offers an assortment of substitutable products and the customer makes the purchase among offered products according to an uncapacitated multinomial logit (MNL) model. Because all the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customers’ choice behavior and make dynamic decisions on assortments based on the current knowledge. The goal of the seller is to maximize the expected revenue, or, equivalently, to minimize the expected regret. Although dynamic assortment planning problem has received an increasing attention in revenue management, most existing policies require the estimation of mean utility for each product and the final regret usually involves the number of products [Formula: see text]. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice model—the MNL model—is still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence bound construction, which achieves an item-independent regret bound of [Formula: see text], where [Formula: see text] is the length of selling horizon. We further establish the matching lower bound result to show the optimality of our policy. There are two major advantages of the proposed policy. First, the regret of all our policies has no dependence on [Formula: see text]. Second, our policies are almost assumption-free: there is no assumption on mean utility nor any “separability” condition on the expected revenues for different assortments. We also extend our trisection search algorithm to capacitated MNL models and obtain the optimal regret [Formula: see text] (up to logrithmic factors) without any assumption on the mean utility parameters of items.  more » « less
Award ID(s):
1845444
NSF-PAR ID:
10301868
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Mathematics of Operations Research
ISSN:
0364-765X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Price-based revenue management is an important problem in operations management with many practical applications. The problem considers a seller who sells one or multiple products over T consecutive periods and is subject to constraints on the initial inventory levels of resources. Whereas, in theory, the optimal pricing policy could be obtained via dynamic programming, computing the exact dynamic programming solution is often intractable. Approximate policies, such as the resolving heuristics, are often applied as computationally tractable alternatives. In this paper, we show the following two results for price-based network revenue management under a continuous price set. First, we prove that a natural resolving heuristic attains O(1) regret compared with the value of the optimal policy. This improves the [Formula: see text] regret upper bound established in the prior work by Jasin in 2014. Second, we prove that there is an [Formula: see text] gap between the value of the optimal policy and that of the fluid model. This complements our upper bound result by showing that the fluid is not an adequate information-relaxed benchmark when analyzing price-based revenue management algorithms. Funding: This work was supported in part by the National Science Foundation [Grant CMMI-2145661]. 
    more » « less
  2. Problem definition: We consider network revenue management problems with flexible products. We have a network of resources with limited capacities. To each customer arriving into the system, we offer an assortment of products. The customer chooses a product within the offered assortment or decides to leave without a purchase. The products are flexible in the sense that there are multiple possible combinations of resources that we can use to serve a customer with a purchase for a particular product. We refer to each such combination of resources as a route. The service provider chooses the route to serve a customer with a purchase for a particular product. Such flexible products occur, for example, when customers book at-home cleaning services but leave the timing of service to the company that provides the service. Our goal is to find a policy to decide which assortment of products to offer to each customer to maximize the total expected revenue, making sure that there are always feasible route assignments for the customers with purchased products. Methodology/results: We start by considering the case in which we make the route assignments at the end of the selling horizon. The dynamic programming formulation of the problem is significantly different from its analogue without flexible products as the state variable keeps track of the number of purchases for each product rather than the remaining capacity of each resource. Letting L be the maximum number of resources in a route, we give a policy that obtains at least [Formula: see text] fraction of the optimal total expected revenue. We extend our policy to the case in which we make the route assignments periodically over the selling horizon. Managerial implications: To our knowledge, the policy that we develop is the first with a performance guarantee under flexible products. Thus, our work constructs policies that can be implemented in practice under flexible products, also providing performance guarantees. Funding: The work of H. Topaloglu was partly funded by the National Science Foundation [Grant CMMI-1825406]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2022.0583 . 
    more » « less
  3. We develop a variant of the multinomial logit model with impatient customers and study assortment optimization and pricing problems under this choice model. In our choice model, a customer incrementally views the assortment of available products in multiple stages. The patience level of a customer determines the maximum number of stages in which the customer is willing to view the assortments of products. In each stage, if the product with the largest utility provides larger utility than a minimum acceptable utility, which we refer to as the utility of the outside option, then the customer purchases that product right away. Otherwise, the customer views the assortment of products in the next stage as long as the customer’s patience level allows the customer to do so. Under the assumption that the utilities have the Gumbel distribution and are independent, we give a closed-form expression for the choice probabilities. For the assortment-optimization problem, we develop a polynomial-time algorithm to find the revenue-maximizing sequence of assortments to offer. For the pricing problem, we show that, if the sequence of offered assortments is fixed, then we can solve a convex program to find the revenue-maximizing prices, with which the decision variables are the probabilities that a customer reaches different stages. We build on this result to give a 0.878-approximation algorithm when both the sequence of assortments and the prices are decision variables. We consider the assortment-optimization problem when each product occupies some space and there is a constraint on the total space consumption of the offered products. We give a fully polynomial-time approximation scheme for this constrained problem. We use a data set from Expedia to demonstrate that incorporating patience levels, as in our model, can improve purchase predictions. We also check the practical performance of our approximation schemes in terms of both the quality of solutions and the computation times. 
    more » « less
  4. null (Ed.)
    The prevalence of e-commerce has made customers’ detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When using personalized information, the question of how to protect the privacy of such information becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over T time periods with an unknown demand function of posted price and personalized information. At each time t, the retailer observes an arriving customer’s personal information and offers a price. The customer then makes the purchase decision, which will be utilized by the retailer to learn the underlying demand function. There is potentially a serious privacy concern during this process: a third-party agent might infer the personalized information and purchase decisions from price changes in the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacy-preserving dynamic pricing policy, which tries to maximize the retailer revenue while avoiding information leakage of individual customer’s information and purchasing decisions. To this end, we first introduce a notion of anticipating [Formula: see text]-differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. Roughly speaking, for d-dimensional personalized information, our algorithm achieves the expected regret at the order of [Formula: see text] when the customers’ information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to [Formula: see text]. This paper was accepted by J. George Shanthikumar, big data analytics. 
    more » « less
  5. We examine the revenue–utility assortment optimization problem with the goal of finding an assortment that maximizes a linear combination of the expected revenue of the firm and the expected utility of the customer. This criterion captures the trade-off between the firm-centric objective of maximizing the expected revenue and the customer-centric objective of maximizing the expected utility. The customers choose according to the multinomial logit model, and there is a constraint on the offered assortments characterized by a totally unimodular matrix. We show that we can solve the revenue–utility assortment optimization problem by finding the assortment that maximizes only the expected revenue after adjusting the revenue of each product by the same constant. Finding the appropriate revenue adjustment requires solving a nonconvex optimization problem. We give a parametric linear program to generate a collection of candidate assortments that is guaranteed to include an optimal solution to the revenue–utility assortment optimization problem. This collection of candidate assortments also allows us to construct an efficient frontier that shows the optimal expected revenue–utility pairs as we vary the weights in the objective function. Moreover, we develop an approximation scheme that limits the number of candidate assortments while ensuring a prespecified solution quality. Finally, we discuss practical assortment optimization problems that involve totally unimodular constraints. In our computational experiments, we demonstrate that we can obtain significant improvements in the expected utility without incurring a significant loss in the expected revenue. This paper was accepted by Omar Besbes, revenue management and market analytics. 
    more » « less