Title: Selection Without Exclusion
It is well understood that classical sample selection models are not semiparametrically identified without exclusion restrictions. Lee (2009) developed bounds for the parameters in a model that nests the semiparametric sample selection model. These bounds can be wide. In this paper, we investigate bounds that impose the full structure of a sample selection model with errors that are independent of the explanatory variables but have unknown distribution. The additional structure can significantly reduce the identified set for the parameters of interest. Specifically, we construct the identified set for the parameter vector of interest. It is a one‐dimensional line segment in the parameter space, and we demonstrate that this line segment can be short in practice. We show that the identified set is sharp when the model is correct and empty when there exist no parameter values that make the sample selection model consistent with the data. We also provide non‐sharp bounds under the assumption that the model is correct. These are easier to compute and associated with lower statistical uncertainty than the sharp bounds. Throughout the paper, we illustrate our approach by estimating a standard sample selection model for wages.
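As a rough illustration of the problem the abstract describes, the sketch below simulates a textbook sample selection model in which the same regressor drives both the outcome and the selection equation, so there is no exclusion restriction. It is not the paper's bounds procedure. The function names, parameter values, and the bivariate normal errors are all assumptions made for this example (the paper leaves the error distribution unknown). The point is only that least squares on the selected sample can miss the true slope when the two errors are correlated.

```python
import numpy as np


def simulate_selection(n=200_000, beta=(1.0, 1.0), gamma=(0.0, 1.0), rho=0.8, seed=0):
    """Simulate y = b0 + b1*x + eps, with y observed only when g0 + g1*x + nu > 0.

    Hypothetical setup for illustration: the same x enters both equations
    (no exclusion restriction), and (eps, nu) are independent of x.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # Errors are independent of x but correlated with each other (corr = rho).
    # Normality is assumed here only to make the simulation concrete.
    nu = rng.standard_normal(n)
    eps = rho * nu + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = beta[0] + beta[1] * x + eps
    selected = gamma[0] + gamma[1] * x + nu > 0
    return x, y, selected


def ols_slope(x, y):
    """Return the slope from a least-squares fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


x, y, selected = simulate_selection()
slope_full = ols_slope(x, y)                          # uses outcomes that would be unobserved in practice
slope_selected = ols_slope(x[selected], y[selected])  # what an analyst can actually compute

print(f"share selected:         {selected.mean():.3f}")
print(f"slope, full sample:     {slope_full:.3f}")
print(f"slope, selected sample: {slope_selected:.3f}")
```

With positively correlated errors and a positive selection coefficient, the selected-sample slope falls below the true value of 1, because the conditional mean of the outcome error among selected observations declines in x. Correcting this without an excluded variable or a known error distribution is the identification problem the paper's bounds address.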
Award ID(s):
1824131
PAR ID:
10203528
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Econometrica
Volume:
88
Issue:
3
ISSN:
0012-9682
Page Range / eLocation ID:
1007 to 1029
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An influential paper by Kleibergen (2005, Econometrica 73, 1103–1123) introduces Lagrange multiplier (LM) and conditional likelihood ratio-like (CLR) tests for nonlinear moment condition models. These procedures aim to have good size performance even when the parameters are unidentified or poorly identified. However, the asymptotic size and similarity (in a uniform sense) of these procedures have not been determined in the literature. This paper does so. This paper shows that the LM test has correct asymptotic size and is asymptotically similar for a suitably chosen parameter space of null distributions. It shows that the CLR tests also have these properties when the dimension p of the unknown parameter θ equals 1. When p ≥ 2, however, the asymptotic size properties are found to depend on how the conditioning statistic, upon which the CLR tests depend, is weighted. Two weighting methods have been suggested in the literature. The paper shows that the CLR tests are guaranteed to have correct asymptotic size when p ≥ 2 when the weighting is based on an estimator of the variance of the sample moments, i.e., moment-variance weighting, combined with the Robin and Smith (2000, Econometric Theory 16, 151–175) rank statistic. The paper also determines a formula for the asymptotic size of the CLR test when the weighting is based on an estimator of the variance of the sample Jacobian. However, the results of the paper do not guarantee correct asymptotic size when p ≥ 2 with the Jacobian-variance weighting, combined with the Robin and Smith (2000, Econometric Theory 16, 151–175) rank statistic, because two key sample quantities are not necessarily asymptotically independent under some identification scenarios. Analogous results for confidence sets are provided. Even for the special case of a linear instrumental variable regression model with two or more right-hand side endogenous variables, the results of the paper are new to the literature. 
  2. Storage systems usually have many parameters that affect their behavior. Tuning those parameters can provide significant gains in performance. Alas, both manual and automatic tuning methods struggle due to the large number of parameters and exponential number of possible configurations. Since previous research has shown that some parameters have greater performance impact than others, focusing on a smaller number of more important parameters can speed up auto-tuning systems because they would have a smaller state space to explore. In this paper, we propose Carver, which uses (1) a variance-based metric to quantify storage parameters’ importance, (2) Latin Hypercube Sampling to sample huge parameter spaces, and (3) a greedy but efficient parameter-selection algorithm that can identify important parameters. We evaluated Carver on datasets consisting of more than 500,000 experiments on 7 file systems, under 4 representative workloads. Carver successfully identified important parameters for all file systems and showed that importance varies with different workloads. We demonstrated that Carver was able to identify a near-optimal set of important parameters in our datasets. We showed Carver’s efficiency by testing it with a small fraction of our dataset; it was able to identify the same set of important parameters with as little as 0.4% of the whole dataset.
  3. This paper examines subset selection for nonlinear least squares parameter estimation, and applies the methodology to a test system previously studied in the power system literature, involving the on-line identification of a synchronous generator model with many parameters. Subset selection partitions the parameters into well-conditioned and ill-conditioned subsets. We show for the test system that fixing the ill-conditioned parameters to prior estimates (even if these prior estimates are substantially in error), and estimating only the remaining parameters, significantly improves the performance of the estimation algorithm and greatly enhances the quality of the estimated parameters. It is shown that attempts to estimate all of the model parameters, as done in the original work with this test system, can yield extremely unreliable results.
  4. Finley, Stacey (Ed.)
    Since the seminal 1961 paper of Monod and Jacob, mathematical models of biomolecular circuits have guided our understanding of cell regulation. Model-based exploration of the functional capabilities of any given circuit requires systematic mapping of multidimensional spaces of model parameters. Despite significant advances in computational dynamical systems approaches, this analysis remains a nontrivial task. Here, we use a nonlinear system of ordinary differential equations to model oocyte selection in Drosophila, a robust symmetry-breaking event that relies on autoregulatory localization of oocyte-specification factors. By applying an algorithmic approach that implements symbolic computation and topological methods, we enumerate all phase portraits of stable steady states in the limit when nonlinear regulatory interactions become discrete switches. Leveraging this initial exact partitioning and further using numerical exploration, we locate parameter regions that are dense in purely asymmetric steady states when the nonlinearities are not infinitely sharp, enabling systematic identification of parameter regions that correspond to robust oocyte selection. This framework can be generalized to map the full parameter spaces in a broad class of models involving biological switches.