Random parameter logit models address unobserved preference heterogeneity in discrete choice analysis. The latent class logit model assumes a discrete heterogeneity distribution, by combining a conditional logit model of economic choices with a multinomial logit (MNL) for stochastic assignment to classes. Whereas point estimation of latent class logit models is widely applied in practice, stochastic assignment of individuals to classes needs further analysis. In this paper we analyze the statistical behavior of six competing class assignment strategies, namely: maximum prior MNL probabilities, class drawn from prior MNL probabilities, maximum posterior assignment, drawn posterior assignment, conditional individual-specific estimates, and conditional individual estimates combined with the Krinsky–Robb method to account for uncertainty. Using both a Monte Carlo study and two empirical case studies, we show that assigning individuals to classes based on maximum MNL probabilities behaves better than randomly drawn classes in market share predictions. However, randomly drawn classes have higher accuracy in predicted class shares. Finally, class assignment based on individual-level conditional estimates that account for the sampling distribution of the assignment parameters shows superior behavior for a larger number of choice occasions per individual.
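As a rough illustration of the posterior-based assignment strategies named in the abstract above, the sketch below computes class-membership probabilities for one individual by combining prior class shares with class-specific conditional logit likelihoods, then assigns a class either by maximum posterior or by a random draw. The function names, array shapes, and numbers are hypothetical and are not taken from the paper; the prior is passed in as a fixed vector rather than estimated from an MNL on individual characteristics.

```python
import numpy as np

def softmax(v, axis=-1):
    """Numerically stable softmax along the given axis."""
    v = v - v.max(axis=axis, keepdims=True)
    e = np.exp(v)
    return e / e.sum(axis=axis, keepdims=True)

def posterior_class_probs(X, y, betas, prior):
    """Posterior class probabilities for one individual.

    X:     (T, J, K) attributes for T choice occasions, J alternatives
    y:     (T,) index of the chosen alternative on each occasion
    betas: (C, K) class-specific taste coefficients (assumed known here)
    prior: (C,) prior class probabilities, summing to one
    """
    util = np.einsum("tjk,ck->ctj", X, betas)   # utilities, shape (C, T, J)
    probs = softmax(util, axis=2)               # conditional logit per class
    chosen = probs[:, np.arange(len(y)), y]     # (C, T) prob. of observed choices
    log_post = np.log(prior) + np.log(chosen).sum(axis=1)
    return softmax(log_post)                    # normalise over classes

# Hypothetical data: 8 choice occasions, 3 alternatives, 2 attributes, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3, 2))
y = rng.integers(0, 3, size=8)
betas = np.array([[1.0, -0.5], [-0.3, 0.8]])
prior = np.array([0.6, 0.4])

post = posterior_class_probs(X, y, betas, prior)
max_class = int(np.argmax(post))                  # maximum posterior assignment
drawn_class = int(rng.choice(len(post), p=post))  # drawn posterior assignment
```

With more choice occasions per individual the likelihood term carries more weight relative to the prior, which is one intuition for why conditional, individual-level assignment might do better in that setting.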
Nonseparable multinomial choice models in cross-section and panel data
Multinomial choice models are fundamental for empirical modeling of economic choices among discrete alternatives. We analyze identification of binary and multinomial choice models when the choice utilities are nonseparable in observed attributes and multidimensional unobserved heterogeneity, with cross-section and panel data. We show that derivatives of choice probabilities with respect to continuous attributes are weighted averages of utility derivatives in cross-section models with exogenous heterogeneity. In the special case of random coefficient models with an independent additive effect, we further show that the probability derivative at zero is proportional to the population mean of the coefficients. We extend the identification results to models with endogenous heterogeneity using either a control function or panel data. In time stationary panel models with two periods, we find that differences over time of derivatives of choice probabilities identify utility derivatives “on the diagonal,” i.e. when the observed attributes take the same values in the two periods. We also show that time stationarity does not identify structural derivatives “off the diagonal” in both continuous and multinomial choice panel models.
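The random-coefficient result can be checked numerically in a simple binary special case. The sketch below assumes utility x'β + ε with a standard logistic additive effect ε that is independent of β, so that P(Y=1 | x) = E[Λ(x'β)] and the gradient at x = 0 should equal Λ'(0)·E[β] = 0.25·E[β]. The logistic choice, the coefficient distribution, and all names are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def choice_prob(x, betas):
    """P(Y=1 | x): average over coefficient draws of the logistic CDF at x'beta."""
    return np.mean(1.0 / (1.0 + np.exp(-betas @ x)))

def prob_gradient_at_zero(betas, h=1e-4):
    """Central finite-difference gradient of P(Y=1 | x) evaluated at x = 0."""
    k = betas.shape[1]
    grad = np.empty(k)
    for j in range(k):
        e = np.zeros(k)
        e[j] = h
        grad[j] = (choice_prob(e, betas) - choice_prob(-e, betas)) / (2 * h)
    return grad

# Hypothetical population of random coefficients (5000 draws, 2 attributes).
rng = np.random.default_rng(0)
betas = rng.normal(loc=[1.0, -0.5], scale=[0.8, 0.3], size=(5000, 2))

grad = prob_gradient_at_zero(betas)
expected = 0.25 * betas.mean(axis=0)  # logistic density at zero times mean coefficient
```

Here `grad` and `expected` should agree up to finite-difference error, with the proportionality constant equal to the density of the additive effect at zero.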
- Award ID(s):
- 1757140
- PAR ID:
- 10472340
- Publisher / Repository:
- Journal of Econometrics
- Date Published:
- Journal Name:
- Journal of Econometrics
- Volume:
- 211
- Issue:
- 1
- ISSN:
- 0304-4076
- Page Range / eLocation ID:
- 104 to 116
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This article surveys the development of nonparametric models and methods for estimation of choice models with nonlinear budget sets. The discussion focuses on the budget set regression, that is, the conditional expectation of a choice variable given the budget set. Utility maximization in a nonparametric model with general heterogeneity reduces the curse of dimensionality in this regression. Empirical results using this regression are different from maximum likelihood and give informative inference. The article also considers the information provided by kink probabilities for nonparametric utility with general heterogeneity. Instrumental variable estimation and the evidence it provides of heterogeneity in preferences are also discussed.
-
We develop a variant of the multinomial logit model with impatient customers and study assortment optimization and pricing problems under this choice model. In our choice model, a customer incrementally views the assortment of available products in multiple stages. The patience level of a customer determines the maximum number of stages in which the customer is willing to view the assortments of products. In each stage, if the product with the largest utility provides larger utility than a minimum acceptable utility, which we refer to as the utility of the outside option, then the customer purchases that product right away. Otherwise, the customer views the assortment of products in the next stage as long as the customer’s patience level allows the customer to do so. Under the assumption that the utilities have the Gumbel distribution and are independent, we give a closed-form expression for the choice probabilities. For the assortment-optimization problem, we develop a polynomial-time algorithm to find the revenue-maximizing sequence of assortments to offer. For the pricing problem, we show that, if the sequence of offered assortments is fixed, then we can solve a convex program to find the revenue-maximizing prices, with which the decision variables are the probabilities that a customer reaches different stages. We build on this result to give a 0.878-approximation algorithm when both the sequence of assortments and the prices are decision variables. We consider the assortment-optimization problem when each product occupies some space and there is a constraint on the total space consumption of the offered products. We give a fully polynomial-time approximation scheme for this constrained problem. We use a data set from Expedia to demonstrate that incorporating patience levels, as in our model, can improve purchase predictions. 
We also check the practical performance of our approximation schemes in terms of both the quality of solutions and the computation times.
-
Customer preference modelling has been widely used to aid engineering design decisions on the selection and configuration of design attributes. Recently, network analysis approaches, such as the exponential random graph model (ERGM), have been increasingly used in this field. While the ERGM-based approach has the new capability of modelling the effects of interactions and interdependencies (e.g., social relationships among customers) on customers’ decisions via network structures (e.g., using triangles to model peer influence), existing research can only model customers’ consideration decisions, and it cannot predict individual customers’ choices, as traditional utility-based discrete choice models (DCMs) do. However, the ability to make choice predictions is essential to predicting market demand, which forms the basis of decision-based design (DBD). This paper fills this gap by developing a novel ERGM-based approach for choice prediction. This is the first time that a network-based model can explicitly compute the probability of an alternative being chosen from a choice set. Using a large-scale customer-revealed choice database, this research studies the customer preferences estimated from the ERGM-based choice models with and without network structures and evaluates their predictive performance of market demand, benchmarking against the multinomial logit (MNL) model, a traditional DCM. The results show that the proposed ERGM-based choice modelling achieves higher accuracy in predicting both individual choice behaviours and market share ranking than the MNL model, which is mathematically equivalent to ERGM when no network structures are included. The insights obtained from this study further extend the DBD framework by allowing explicit modelling of interactions among entities (i.e., customers and products) using network representations.
-
This paper establishes central limit theorems (CLTs) and proposes how to perform valid inference in factor models. We consider a setting where many counties/regions/assets are observed for many time periods, and when estimation of a global parameter includes aggregation of a cross-section of heterogeneous microparameters estimated separately for each entity. The CLT applies for quantities involving both cross-sectional and time series aggregation, as well as for quadratic forms in time-aggregated errors. This paper studies the conditions when one can consistently estimate the asymptotic variance, and proposes a bootstrap scheme for cases when one cannot. A small simulation study illustrates performance of the asymptotic and bootstrap procedures. The results are useful for making inferences in two-step estimation procedures related to factor models, as well as in other related contexts. Our treatment avoids structural modeling of cross-sectional dependence but imposes time-series independence.