Title: Identification in simple binary outcome panel data models
Summary: This paper first reviews some of the approaches that have been taken to estimate the common parameters of binary outcome models with fixed effects. We limit attention to situations in which the researcher has access to a data set with a large number of units (individuals or companies, for example) observed over a number of time periods. We then apply some of the existing approaches to study fixed-effects panel data versions of entry games, like the ones studied in Bresnahan and Reiss (1991) and Tamer (2003).
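The fixed-effects approaches the summary alludes to can be illustrated with the classic conditional-logit idea (Chamberlain): with two time periods, conditioning on individuals who "switch" (y1 + y2 = 1) makes the fixed effect cancel out of the likelihood, leaving a logit in the regressor difference. A minimal pure-Python sketch on simulated data, not the paper's estimator; all names and parameter values are illustrative:

```python
import math
import random

def simulate_panel(n, beta, seed=42):
    """Simulate a two-period binary-outcome logit panel with individual fixed effects."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        alpha = rng.gauss(0.0, 1.0)                 # unobserved individual fixed effect
        x = [rng.gauss(0.0, 1.0) for _ in range(2)]
        y = [int(rng.random() < 1.0 / (1.0 + math.exp(-(alpha + beta * xt))))
             for xt in x]
        data.append((x, y))
    return data

def conditional_loglik(beta, data):
    """Conditional log-likelihood: only 'switchers' (y1 + y2 = 1) contribute,
    and the fixed effect alpha cancels, leaving a logit in x2 - x1."""
    ll = 0.0
    for (x1, x2), (y1, y2) in data:
        if y1 + y2 != 1:
            continue
        p2 = 1.0 / (1.0 + math.exp(-beta * (x2 - x1)))  # P(y2 = 1 | switcher)
        ll += math.log(p2 if y2 == 1 else 1.0 - p2)
    return ll

def estimate_beta(data, lo=-4.0, hi=4.0, iters=80):
    """Golden-section search for the scalar conditional MLE
    (the conditional log-likelihood is concave, hence unimodal)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if conditional_loglik(c, data) < conditional_loglik(d, data):
            a = c
        else:
            b = d
    return 0.5 * (a + b)

data = simulate_panel(4000, beta=1.0)
beta_hat = estimate_beta(data)  # recovers the common slope without estimating any alpha_i
```

The point of the conditioning step is that the incidental-parameter problem never arises: the common slope is identified from within-individual variation alone.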
Award ID(s): 1824131
PAR ID: 10300303
Author(s) / Creator(s): ;
Date Published:
Journal Name: The Econometrics Journal
Volume: 24
Issue: 2
ISSN: 1368-4221
Page Range / eLocation ID: C78 to C93
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Cosmic demographics—the statistical study of populations of astrophysical objects—has long relied on tools from multivariate statistics for analyzing data comprising fixed-length vectors of properties of objects, as might be compiled in a tabular astronomical catalog (say, with sky coordinates, and brightness measurements in a fixed number of spectral passbands). But beginning with the emergence of automated digital sky surveys, ca. 2000, astronomers began producing large collections of data with more complex structures: light curves (brightness time series) and spectra (brightness vs. wavelength). These comprise what statisticians call functional data—measurements of populations of functions. Upcoming automated sky surveys will soon provide astronomers with a flood of functional data. New methods are needed to accurately and optimally analyze large ensembles of light curves and spectra, accumulating information both along individual measured functions and across a population of such functions. Functional data analysis (FDA) provides tools for statistical modeling of functional data. Astronomical data presents several challenges for FDA methodology, e.g., sparse, irregular, and asynchronous sampling, and heteroscedastic measurement error. Bayesian FDA uses hierarchical Bayesian models for function populations, and is well suited to addressing these challenges. We provide an overview of astronomical functional data and some key Bayesian FDA modeling approaches, including functional mixed effects models, and stochastic process models. We briefly describe a Bayesian FDA framework combining FDA and machine learning methods to build low-dimensional parametric models for galaxy spectra. 
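The hierarchical pooling at the heart of the Bayesian FDA approaches described above can be seen in its simplest form as normal-normal partial pooling: each object's noisy estimate is shrunk toward the population mean, with noisier (more heteroscedastic) measurements borrowing more strength from the population. A toy sketch with known hyperparameters, purely illustrative and not the review's models:

```python
import random

def partial_pool(obs, mu0, tau2):
    """Normal-normal partial pooling: for measurement y_i with known error
    variance s2_i, the posterior mean of the latent level theta_i shrinks
    y_i toward the population mean mu0 by B_i = s2_i / (s2_i + tau2)."""
    return [(s2 / (s2 + tau2)) * mu0 + (tau2 / (s2 + tau2)) * y
            for y, s2 in obs]

rng = random.Random(1)
mu0, tau2 = 10.0, 1.0           # population mean and variance (assumed known here)
truth, obs = [], []
for _ in range(500):
    theta = rng.gauss(mu0, tau2 ** 0.5)   # latent per-object level
    s2 = rng.uniform(0.5, 4.0)            # heteroscedastic measurement variance
    obs.append((rng.gauss(theta, s2 ** 0.5), s2))
    truth.append(theta)

pooled = partial_pool(obs, mu0, tau2)
mse_raw = sum((y - t) ** 2 for (y, _), t in zip(obs, truth)) / len(truth)
mse_pooled = sum((p - t) ** 2 for p, t in zip(pooled, truth)) / len(truth)
# partial pooling reduces mean squared error versus the raw measurements
```

In a full Bayesian FDA model the same shrinkage operates on whole functions (light curves or spectra) rather than scalars, and the hyperparameters are themselves estimated, but the borrowing-of-strength mechanism is the same.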
  2.
    We consider the problem of learning predictive models from longitudinal data, consisting of irregularly repeated, sparse observations from a set of individuals over time. Such data often exhibit longitudinal correlation (LC) (correlations among observations for each individual over time), cluster correlation (CC) (correlations among individuals that have similar characteristics), or both. These correlations are often accounted for using mixed effects models that include fixed effects and random effects, where the fixed effects capture the regression parameters that are shared by all individuals, whereas random effects capture those parameters that vary across individuals. However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables. We propose the Longitudinal Multi-Level Factorization Machine (LMLFM), to the best of our knowledge the first model to address these challenges in learning predictive models from longitudinal data. We establish the convergence properties, and analyze the computational complexity, of LMLFM. We present results of experiments with both simulated and real-world longitudinal data which show that LMLFM outperforms the state-of-the-art methods in terms of predictive accuracy, variable selection ability, and scalability to data with a large number of variables. The code and supplemental material are available at https://github.com/junjieliang672/LMLFM.
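The fixed-effects/random-effects split described in the abstract is easiest to see in the simplest mixed model, a random-intercept regression y_ij = beta * x_ij + b_i + eps_ij: demeaning within each individual removes the individual-specific intercept and recovers the shared slope. A toy pure-Python sketch of that within estimator (not LMLFM itself; all values illustrative):

```python
import random

def within_slope(groups):
    """Within (demeaning) estimator: subtracting each individual's means
    removes the individual-specific intercept b_i, leaving ordinary least
    squares on the within-individual deviations."""
    num = den = 0.0
    for xs, ys in groups:
        xbar = sum(xs) / len(xs)
        ybar = sum(ys) / len(ys)
        num += sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        den += sum((x - xbar) ** 2 for x in xs)
    return num / den

rng = random.Random(7)
beta = 2.0
groups = []
for _ in range(300):                                  # 300 individuals
    b_i = rng.gauss(0.0, 2.0)                         # random intercept (LC source)
    xs = [rng.gauss(0.0, 1.0) for _ in range(5)]      # 5 repeated observations each
    ys = [beta * x + b_i + rng.gauss(0.0, 1.0) for x in xs]
    groups.append((xs, ys))

beta_hat = within_slope(groups)  # shared (fixed-effect) slope, robust to b_i
```

Methods like LMLFM generalize this far beyond a single intercept, selecting which effects are shared and which vary per individual, but the underlying goal is the same: separate population-level structure from individual-level variation.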
    We consider the problem of learning predictive models from longitudinal data, consisting of irregularly repeated, sparse observations from a set of individuals over time. Such data often exhibit longitudinal correlation (LC) (correlations among observations for each individual over time), cluster correlation (CC) (correlations among individuals that have similar characteristics), or both. These correlations are often accounted for using mixed effects models that include fixed effects and random effects, where the fixed effects capture the regression parameters that are shared by all individuals, whereas random effects capture those parameters that vary across individuals. However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables. We propose the Longitudinal Multi-Level Factorization Machine (LMLFM), to the best of our knowledge the first model to address these challenges in learning predictive models from longitudinal data. We establish the convergence properties, and analyze the computational complexity, of LMLFM. We present results of experiments with both simulated and real-world longitudinal data which show that LMLFM outperforms the state-of-the-art methods in terms of predictive accuracy, variable selection ability, and scalability to data with a large number of variables. The code and supplemental material are available at https://github.com/junjieliang672/LMLFM.
  4. Abstract: Cells with the same genome can exist in different phenotypes and can change between distinct phenotypes when subject to specific stimuli and microenvironments. Some examples include cell differentiation during development, reprogramming for induced pluripotent stem cells and transdifferentiation, cancer metastasis and fibrosis progression. The regulation and dynamics of cell phenotypic conversion is a fundamental problem in biology, and has a long history of being studied within the formalism of dynamical systems. A main challenge for mechanism-driven modeling studies is acquiring a sufficient amount of quantitative information for constraining model parameters. Advances in quantitative experimental approaches, especially high-throughput single-cell techniques, have accelerated the emergence of a new direction for reconstructing the governing dynamical equations of a cellular system from quantitative single-cell data, beyond the dominant statistical approaches. Here I review a selected number of recent studies using live- and fixed-cell data and provide my perspective on future development.
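The dynamical-systems formalism mentioned in the abstract treats phenotypes as stable states of a multistable system. The simplest caricature is a double-well landscape, dx/dt = x - x^3, where two stable fixed points play the role of distinct phenotypes and the unstable state between them is the commitment threshold. A minimal sketch, purely illustrative and not a model from the review:

```python
def integrate(x0, dt=0.01, steps=3000):
    """Forward-Euler integration of dx/dt = x - x^3, a double-well landscape
    with stable fixed points ('phenotypes') at x = -1 and x = +1 and an
    unstable fixed point (commitment threshold) at x = 0."""
    x = x0
    for _ in range(steps):
        x += dt * (x - x ** 3)
    return x

# small perturbations on either side of the unstable state commit the
# trajectory to different stable phenotypes
high = integrate(0.2)    # converges toward x = +1
low = integrate(-0.2)    # converges toward x = -1
```

Data-driven reconstruction, as surveyed in the review, aims to recover the right-hand side of such equations (and the resulting landscape) from single-cell snapshots or trajectories rather than positing it a priori.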
  5.
    Abstract: The factor of four increase in the LHC luminosity, from 0.5 × 10³⁴ cm⁻² s⁻¹ to 2.0 × 10³⁴ cm⁻² s⁻¹, and the corresponding increase in pile-up collisions during the 2015–2018 data-taking period, presented a challenge for the ATLAS trigger, particularly for those algorithms that select events with missing transverse momentum. The output data rate at fixed threshold typically increases exponentially with the number of pile-up collisions, so the legacy algorithms from previous LHC data-taking periods had to be tuned and new approaches developed to maintain the high trigger efficiency achieved in earlier operations. A study of the trigger performance and comparisons with simulations show that these changes resulted in event selection efficiencies of > 98% for this period, meeting and in some cases exceeding the performance of similar triggers in earlier run periods, while at the same time keeping the necessary bandwidth within acceptable limits.
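The abstract's statement that the output rate at fixed threshold grows roughly exponentially with pile-up can be made concrete with a two-point fit of the growth constant, rate(mu) = a * exp(b * mu). The numbers below are toy values for illustration, not ATLAS parameters:

```python
import math

def rate(mu, a, b):
    """Toy model: trigger output rate at a fixed threshold, growing
    exponentially with the mean number of pile-up collisions mu."""
    return a * math.exp(b * mu)

def fit_growth(mu1, r1, mu2, r2):
    """Recover the exponential growth constant b from two (mu, rate) points:
    b = ln(r2 / r1) / (mu2 - mu1)."""
    return math.log(r2 / r1) / (mu2 - mu1)

# illustrative values: a = 2.0 Hz, b = 0.12 per pile-up collision
r_low = rate(20.0, 2.0, 0.12)
r_high = rate(60.0, 2.0, 0.12)
b_hat = fit_growth(20.0, r_low, 60.0, r_high)
```

This kind of steep dependence is why a fixed-threshold trigger that was affordable at low pile-up can saturate the available bandwidth at high pile-up, motivating the retuning and new algorithms the abstract describes.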