skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.


Title: A polynomial-time algorithm for learning nonparametric causal graphs
We establish finite-sample guarantees for a polynomial-time algorithm for learning a nonlinear, nonparametric directed acyclic graphical (DAG) model from data. The analysis is model-free and does not assume linearity, additivity, independent noise, or faithfulness. Instead, we impose a condition on the residual variances that is closely related to previous work on linear models with equal variances. Compared to an optimal algorithm with oracle knowledge of the variable ordering, the additional cost of the algorithm is linear in the dimension d and the number of samples n. Finally, we compare the proposed algorithm to existing approaches in a simulation study.  more » « less
Award ID(s):
1823032
PAR ID:
10298046
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents finite‐sample efficiency bounds for the core econometric problem of estimation of linear regression coefficients. We show that the classical Gauss–Markov theorem can be restated omitting the unnatural restriction to linear estimators, without adding any extra conditions. Our results are lower bounds on the variances of unbiased estimators. These lower bounds correspond to the variances of the the least squares estimator and the generalized least squares estimator, depending on the assumption on the error covariances. These results show that we can drop the label “linear estimator” from the pedagogy of the Gauss–Markov theorem. Instead of referring to these estimators as BLUE, they can legitimately be called BUE (best unbiased estimators). 
    more » « less
  2. This research develops, compares, and analyzes both a traditional algorithm using computer vision and a deep learning model to deal with dynamic road conditions. In the final testing, the deep learning model completed the target of five laps for both the inner and outer lane, whereas the computer vision algorithm only completed almost three laps for the inner lane and slightly over four laps for the outer. After conducting statistical analysis on the results of our deep learning model by finding the p-value between the absolute error and squared error of the self-driving algorithm in the outer lane and inner lane, we find that our results are statistically significant based on a two-tailed T test with unequal variances where the p-value for absolute error is 0.009, and 0.001 for squared error. Self-driving vehicles are not only complex, but they are growing in necessity—therefore, finding an optimal solution for lane detection in dynamic conditions is crucial to continue innovation. 
    more » « less
  3. In this paper, we study the problem of optimal data collection for policy evaluation in linear bandits. In policy evaluation, we are given a \textit{target} policy and asked to estimate the expected reward it will obtain when executed in a multi-armed bandit environment. Our work is the first work that focuses on such an optimal data collection strategy for policy evaluation involving heteroscedastic reward noise in the linear bandit setting. We first formulate an optimal design for weighted least squares estimates in the heteroscedastic linear bandit setting with the knowledge of noise variances. This design minimizes the mean squared error (MSE) of the estimated value of the target policy and is termed the oracle design. Since the noise variance is typically unknown, we then introduce a novel algorithm, SPEED (\textbf{S}tructured \textbf{P}olicy \textbf{E}valuation \textbf{E}xperimental \textbf{D}esign), that tracks the oracle design and derive its regret with respect to the oracle design. We show that regret scales as 𝑂˜(𝑑3𝑛−3/2) and prove a matching lower bound of Ω(𝑑2𝑛−3/2) . Finally, we evaluate SPEED on a set of policy evaluation tasks and demonstrate that it achieves MSE comparable to an optimal oracle and much lower than simply running the target policy. 
    more » « less
  4. We propose a Bayesian model selection approach for generalized linear mixed models (GLMMs). We consider covariance structures for the random effects that are widely used in areas such as longitudinal studies, genome-wide association studies, and spatial statistics. Since the random effects cannot be integrated out of GLMMs analytically, we approximate the integrated likelihood function using a pseudo likelihood approach. Our Bayesian approach assumes a flat prior for the fixed effects and includes both approximate reference prior and half-Cauchy prior choices for the variances of random effects. Since the flat prior on the fixed effects is improper, we develop a fractional Bayes factor approach to obtain posterior probabilities of the several competing models. Simulation studies with Poisson generalized linear mixed models with spatial random effects and overdispersion random effects show that our approach performs favorably when compared to widely used competing Bayesian methods including DIC and WAIC. We illustrate the usefulness and flexibility of our approach with three case studies including a Poisson longitudinal model, a Poisson spatial model, and a logistic mixed model. Our proposed approach is implemented in the R package GLMMselect that is available on CRAN. 
    more » « less
  5. The focus of this paper is on the global sensitivity analysis (GSA) of linear systems with time-invariant model parameter uncertainties and driven by stochastic inputs. The Sobol' indices of the evolving mean and variance estimates of states are used to assess the impact of the time-invariant uncertain model parameters and the statistics of the stochastic input on the uncertainty of the output. Numerical results on two benchmark problems help illustrate that it is conceivable that parameters, which are not so significant in contributing to the uncertainty of the mean, can be extremely significant in contributing to the uncertainty of the variances. The paper uses a polynomial chaos (PC) approach to synthesize a surrogate probabilistic model of the stochastic system after using Lagrange interpolation polynomials (LIPs) as PC bases. The Sobol' indices are then directly evaluated from the PC coefficients. Although this concept is not new, a novel interpretation of stochastic collocation-based PC and intrusive PC is presented where they are shown to represent identical probabilistic models when the system under consideration is linear. This result now permits treating linear models as black boxes to develop intrusive PC surrogates. 
    more » « less