Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Assortment optimization with choice model estimation and learning has been studied extensively in the datadriven revenue management literature. Existing methods and analysis, however, do not take into consideration the fact that some customers arriving at certain time periods might exhibit outlier purchasing behaviors. The work of Chen et al. studies dynamic assortment optimization in the presence of outlier customers modeled by an epscontamination model. The impact of outlier customers on the revenue performance of an algorithm is analyzed and discussed.more » « lessFree, publiclyaccessible full text available August 21, 2024

Free, publiclyaccessible full text available October 1, 2024

Abstract Ideological divisions in the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a text document, one can attempt to discern and describe its polarity. Intuitively, the named entities (i.e., the nouns and the phrases that act as nouns) and hashtags in text often carry information about political views. For example, people who use the term “prochoice” are likely to be liberal and people who use the term “prolife” are likely to be conservative. In this paper, we seek to reveal political polarities in socialmedia text data and to quantify these polarities by explicitly assigning a polarity score to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the preservation of both a polarity score and a polarityneutral semantic meaning in an embedding vector of words. To attempt to overcome these challenges, we propose the
P olarityawareE mbeddingM ultitask learning (PEM ) model. This model consists of (1) a selfsupervised contextpreservation task, (2) an attentionbased tweetlevel polarityinference task, and (3) an adversarial learning task that promotes independence between an embedding’s polarity component and its semantic component. Our experimental results demonstrate that ourPEM model can successfully learn polarityaware embeddings that perform well at tweetlevel and accountlevel classification tasks. We examine a variety of applications—including a study of spatial and temporal distributions of polarities and a comparison between tweets from Twitter and posts from Parler—and we thereby demonstrate the effectiveness of ourPEM model. We also discuss important limitations of our work and encourage caution when applying thePEM model to realworld scenarios. 
Pricebased revenue management is an important problem in operations management with many practical applications. The problem considers a seller who sells one or multiple products over T consecutive periods and is subject to constraints on the initial inventory levels of resources. Whereas, in theory, the optimal pricing policy could be obtained via dynamic programming, computing the exact dynamic programming solution is often intractable. Approximate policies, such as the resolving heuristics, are often applied as computationally tractable alternatives. In this paper, we show the following two results for pricebased network revenue management under a continuous price set. First, we prove that a natural resolving heuristic attains O(1) regret compared with the value of the optimal policy. This improves the [Formula: see text] regret upper bound established in the prior work by Jasin in 2014. Second, we prove that there is an [Formula: see text] gap between the value of the optimal policy and that of the fluid model. This complements our upper bound result by showing that the fluid is not an adequate informationrelaxed benchmark when analyzing pricebased revenue management algorithms. Funding: This work was supported in part by the National Science Foundation [Grant CMMI2145661].more » « less

In recent decades, the advance of information technology and abundant personal data facilitate the application of algorithmic personalized pricing. However, this leads to the growing concern of potential violation of privacy because of adversarial attack. To address the privacy issue, this paper studies a dynamic personalized pricing problem with unknown nonparametric demand models under data privacy protection. Two concepts of data privacy, which have been widely applied in practices, are introduced: central differential privacy (CDP) and local differential privacy (LDP), which is proved to be stronger than CDP in many cases. We develop two algorithms that make pricing decisions and learn the unknown demand on the fly while satisfying the CDP and LDP guarantee, respectively. In particular, for the algorithm with CDP guarantee, the regret is proved to be at most [Formula: see text]. Here, the parameter T denotes the length of the time horizon, d is the dimension of the personalized information vector, and the key parameter [Formula: see text] measures the strength of privacy (smaller ε indicates a stronger privacy protection). Conversely, for the algorithm with LDP guarantee, its regret is proved to be at most [Formula: see text], which is near optimal as we prove a lower bound of [Formula: see text] for any algorithm with LDP guarantee.more » « less

In this paper, we study the learning problem in contextual search, which is motivated by applications such as crowdsourcing and personalized medicine experiments. In particular, for a sequence of arriving context vectors, with each context associated with an underlying value, the decision maker either makes a query at a certain point or skips the context. The decision maker will only observe the binary feedback on the relationship between the query point and the value associated with the context. We study a probably approximately correct learning setting, where the goal is to learn the underlying mean value function in context with a minimum number of queries. To address this challenge, we propose a trisection search approach combined with a marginbased active learning method. We show that the algorithm only needs to make [Formula: see text] queries to achieve an εestimation accuracy. This sample complexity significantly reduces the required sample complexity in the passive setting where neither sample skipping nor query selection is allowed, which is at least [Formula: see text]. This paper was accepted by J. George Shanthikumar, data science.more » « less

This paper studies a dynamic pricing problem under model misspecification. To characterize model misspecification, we adopt the εcontamination model—the most fundamental model in robust statistics and machine learning. In particular, for a selling horizon of length T, the online εcontamination model assumes that demands are realized according to a typical unknown demand function only for [Formula: see text] periods. For the rest of [Formula: see text] periods, an outlier purchase can happen with arbitrary demand functions. The challenges brought by the presence of outlier customers are mainly due to the fact that arrivals of outliers and their exhibited demand behaviors are completely arbitrary, therefore calling for robust estimation and exploration strategies that can handle any outlier arrival and demand patterns. We first consider unconstrained dynamic pricing without any inventory constraint. In this case, we adopt the FollowtheRegularizedLeader algorithm to hedge against outlier purchase behavior. Then, we introduce inventory constraints. When the inventory is insufficient, we study a robust bisectionsearch algorithm to identify the clearance price—that is, the price at which the initial inventory is expected to clear at the end of T periods. Finally, we study the general dynamic pricing case, where a retailer has no clue whether the inventory is sufficient or not. In this case, we design a metaalgorithm that combines the previous two policies. All algorithms are fully adaptive, without requiring prior knowledge of the outlier proportion parameter ε. Simulation study shows that our policy outperforms existing policies in the literature.more » « less

We consider the periodic review dynamic pricing and inventory control problem with fixed ordering cost. Demand is random and price dependent, and unsatisfied demand is backlogged. With complete demand information, the celebrated [Formula: see text] policy is proved to be optimal, where s and S are the reorder point and orderupto level for ordering strategy, and [Formula: see text], a function of onhand inventory level, characterizes the pricing strategy. In this paper, we consider incomplete demand information and develop online learning algorithms whose average profit approaches that of the optimal [Formula: see text] with a tight [Formula: see text] regret rate. A number of salient features differentiate our work from the existing online learning researches in the operations management (OM) literature. First, computing the optimal [Formula: see text] policy requires solving a dynamic programming (DP) over multiple periods involving unknown quantities, which is different from the majority of learning problems in OM that only require solving singleperiod optimization questions. It is hence challenging to establish stability results through DP recursions, which we accomplish by proving uniform convergence of the profittogo function. The necessity of analyzing actiondependent state transition over multiple periods resembles the reinforcement learning question, considerably more difficult than existing bandit learning algorithms. Second, the pricing function [Formula: see text] is of infinite dimension, and approaching it is much more challenging than approaching a finite number of parameters as seen in existing researches. The demandprice relationship is estimated based on upper confidence bound, but the confidence interval cannot be explicitly calculated due to the complexity of the DP recursion. Finally, because of the multiperiod nature of [Formula: see text] policies the actual distribution of the randomness in demand plays an important role in determining the optimal pricing strategy [Formula: see text], which is unknown to the learner a priori. In this paper, the demand randomness is approximated by an empirical distribution constructed using dependent samples, and a novel Wasserstein metricbased argument is employed to prove convergence of the empirical distribution. This paper was accepted by J. George Shanthikumar, big data analytics.more » « less

null (Ed.)The prevalence of ecommerce has made customers’ detailed personal information readily accessible to retailers, and this information has been widely used in pricing decisions. When using personalized information, the question of how to protect the privacy of such information becomes a critical issue in practice. In this paper, we consider a dynamic pricing problem over T time periods with an unknown demand function of posted price and personalized information. At each time t, the retailer observes an arriving customer’s personal information and offers a price. The customer then makes the purchase decision, which will be utilized by the retailer to learn the underlying demand function. There is potentially a serious privacy concern during this process: a thirdparty agent might infer the personalized information and purchase decisions from price changes in the pricing system. Using the fundamental framework of differential privacy from computer science, we develop a privacypreserving dynamic pricing policy, which tries to maximize the retailer revenue while avoiding information leakage of individual customer’s information and purchasing decisions. To this end, we first introduce a notion of anticipating [Formula: see text]differential privacy that is tailored to the dynamic pricing problem. Our policy achieves both the privacy guarantee and the performance guarantee in terms of regret. Roughly speaking, for ddimensional personalized information, our algorithm achieves the expected regret at the order of [Formula: see text] when the customers’ information is adversarially chosen. For stochastic personalized information, the regret bound can be further improved to [Formula: see text]. This paper was accepted by J. George Shanthikumar, big data analytics.more » « less