-
We consider the periodic-review dynamic pricing and inventory control problem with fixed ordering cost. Demand is random and price dependent, and unsatisfied demand is backlogged. With complete demand information, the celebrated (s, S, p) policy is proved to be optimal, where s and S are the reorder point and order-up-to level of the ordering strategy, and p, a function of the on-hand inventory level, characterizes the pricing strategy. In this paper, we consider incomplete demand information and develop online learning algorithms whose average profit approaches that of the optimal (s, S, p) policy with a tight [Formula: see text] regret rate. A number of salient features differentiate our work from the existing online learning research in the operations management (OM) literature. First, computing the optimal (s, S, p) policy requires solving a dynamic program (DP) over multiple periods involving unknown quantities, which differs from the majority of learning problems in OM, which only require solving single-period optimization problems. It is hence challenging to establish stability results through DP recursions, which we accomplish by proving uniform convergence of the profit-to-go function. The necessity of analyzing action-dependent state transitions over multiple periods resembles the reinforcement learning setting, considerably more difficult than …
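The ordering and pricing rule described in the abstract above can be illustrated with a minimal sketch. It assumes known parameters s and S and a known pricing function, so it shows only the one-period decision rule, not the paper's learning algorithm. The function name `ssp_decision`, the choice to order when inventory is strictly below s, and the choice to evaluate the price at the post-order inventory level are all assumptions made for illustration.

```python
from typing import Callable, Tuple


def ssp_decision(
    inventory: float,
    s: float,
    S: float,
    price_fn: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (order_quantity, price) for one period under an (s, S, p) rule.

    Assumptions (not taken from the abstract):
      * an order is placed only when inventory is strictly below s;
      * the price is evaluated at the inventory level after ordering.
    Negative inventory represents backlogged demand.
    """
    if s > S:
        raise ValueError("reorder point s must not exceed order-up-to level S")

    # Order up to S when below the reorder point, otherwise do not order.
    post_order_level = S if inventory < s else inventory
    order_quantity = post_order_level - inventory

    # The price depends on the inventory level through price_fn.
    price = price_fn(post_order_level)
    return order_quantity, price


if __name__ == "__main__":
    # Hypothetical linear pricing rule: lower price when more stock is on hand.
    pricing = lambda level: 10.0 - 0.1 * level
    print(ssp_decision(2.0, s=5.0, S=20.0, price_fn=pricing))
    print(ssp_decision(12.0, s=5.0, S=20.0, price_fn=pricing))
```

The first call starts below the reorder point and therefore orders up to S; the second starts above it and orders nothing, with the price set from the current level.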
-
Zooming in on cells reveals patterns on their outer surfaces. These patterns are actually a collection of distinct areas of the cell surface, each containing specific combinations of molecules. The outer layers of pollen grains consist of a cell wall, and a softer cell membrane that sits underneath. As a pollen grain develops, it recruits certain fats and proteins to specific areas of the cell membrane, known as ‘aperture domains’. The composition of these domains blocks the cell wall from forming over them, leading to gaps in the wall called ‘pollen apertures’. Pollen apertures can open and close, aiding reproduction and protecting pollen grains from dehydration. The number, location, and shape of pollen apertures vary between different plant species, but are consistent within the same species. In the plant species Arabidopsis thaliana, pollen normally develops three long and narrow, equally spaced apertures, but it remains unclear how pollen grains control the number and location of aperture domains. Zhou et al. found that mutations in two closely related A. thaliana proteins (ELMOD_A and MCR) alter the number and positions of pollen apertures. When A. thaliana plants were genetically modified so that they would produce different levels of ELMOD_A …
-
We study the dynamic assortment planning problem, where for each arriving customer, the seller offers an assortment of substitutable products and the customer makes a purchase among the offered products according to an uncapacitated multinomial logit (MNL) model. Because all the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customers’ choice behavior and make dynamic decisions on assortments based on current knowledge. The goal of the seller is to maximize the expected revenue, or, equivalently, to minimize the expected regret. Although the dynamic assortment planning problem has received increasing attention in revenue management, most existing policies require estimating the mean utility of each product, and the final regret usually involves the number of products [Formula: see text]. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice model, the MNL model, is still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence bound construction, which achieves an item-independent regret bound of [Formula: see text], where [Formula: see text] is the length of the selling horizon. We further establish a matching lower bound to show the optimality of our policy. There are …
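To make the MNL setting in the abstract above concrete, the following sketch computes choice probabilities and expected revenue for a given assortment when the utilities are known. It is not the trisection-based learning policy from the paper. The function names, the normalization of the no-purchase utility to zero, and the example numbers are assumptions for illustration. The search over revenue-ordered assortments relies on the standard result that, for the uncapacitated MNL model with known utilities, some revenue-ordered assortment is optimal.

```python
import math
from typing import Dict, List, Tuple


def choice_probabilities(
    assortment: List[str], utilities: Dict[str, float]
) -> Dict[str, float]:
    """MNL purchase probabilities; the no-purchase option has utility 0 (assumed)."""
    weights = {item: math.exp(utilities[item]) for item in assortment}
    denominator = 1.0 + sum(weights.values())  # the 1.0 is the no-purchase weight
    return {item: w / denominator for item, w in weights.items()}


def expected_revenue(
    assortment: List[str],
    utilities: Dict[str, float],
    revenues: Dict[str, float],
) -> float:
    """Expected revenue per customer from offering the given assortment."""
    probs = choice_probabilities(assortment, utilities)
    return sum(revenues[item] * p for item, p in probs.items())


def best_revenue_ordered(
    utilities: Dict[str, float], revenues: Dict[str, float]
) -> Tuple[List[str], float]:
    """Search assortments made of the k highest-revenue products, k = 1..N."""
    ranked = sorted(revenues, key=lambda item: revenues[item], reverse=True)
    best_set: List[str] = []
    best_value = 0.0
    for k in range(1, len(ranked) + 1):
        candidate = ranked[:k]
        value = expected_revenue(candidate, utilities, revenues)
        if value > best_value:
            best_set, best_value = candidate, value
    return best_set, best_value


if __name__ == "__main__":
    # Hypothetical two-product instance with equal utilities.
    utilities = {"a": 0.0, "b": 0.0}
    revenues = {"a": 10.0, "b": 4.0}
    print(best_revenue_ordered(utilities, revenues))
```

In this two-product instance, adding the low-revenue product draws purchases away from the high-revenue one, so offering only "a" yields the higher expected revenue.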