Title: Monte Carlo goodness-of-fit tests for degree corrected and related stochastic blockmodels
Abstract

We construct Bayesian and frequentist finite-sample goodness-of-fit tests for three different variants of the stochastic blockmodel for network data. Since all of the stochastic blockmodel variants are log-linear in form when block assignments are known, the tests for the latent block model versions combine a block membership estimator with the algebraic statistics machinery for testing goodness-of-fit in log-linear models. We describe Markov bases and marginal polytopes of the variants of the stochastic blockmodel and discuss how both facilitate the development of goodness-of-fit tests and understanding of model behaviour. The general testing methodology developed here extends to any finite mixture of log-linear models on discrete data, and as such is the first application of the algebraic statistics machinery for latent-variable models.
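The conditional-testing idea in the abstract can be illustrated for the simplest case only: a stochastic blockmodel with known block labels and one edge probability per block pair. This is a minimal sketch under those assumptions, not the authors' implementation. The function names and the chi-square-type statistic are hypothetical choices. Because the sufficient statistics here are just the block-pair edge counts, the sketch replaces a Markov-basis walk with direct uniform sampling from the set of graphs sharing those counts, a shortcut that would not carry over to the degree-corrected variant.

```python
import random
from itertools import combinations


def dyads_by_block_pair(blocks):
    """Group all unordered node pairs by their (sorted) pair of block labels."""
    groups = {}
    for i, j in combinations(range(len(blocks)), 2):
        key = tuple(sorted((blocks[i], blocks[j])))
        groups.setdefault(key, []).append((i, j))
    return groups


def block_pair_counts(edges, blocks, groups):
    """Sufficient statistics of this simple model: edges per block pair."""
    counts = {key: 0 for key in groups}
    for i, j in edges:
        counts[tuple(sorted((blocks[i], blocks[j])))] += 1
    return counts


def chi_square_stat(edges, blocks, groups):
    """Pearson-type statistic on each node's number of edges into each block."""
    counts = block_pair_counts(edges, blocks, groups)
    density = {key: counts[key] / len(groups[key]) for key in groups}
    labels = sorted(set(blocks))
    n = len(blocks)
    observed = {(v, b): 0 for v in range(n) for b in labels}
    for i, j in edges:
        observed[(i, blocks[j])] += 1
        observed[(j, blocks[i])] += 1
    stat = 0.0
    for v in range(n):
        for b in labels:
            possible = sum(1 for u in range(n) if u != v and blocks[u] == b)
            key = tuple(sorted((blocks[v], b)))
            expected = possible * density.get(key, 0.0)
            if expected > 0:  # skip cells that are structurally empty
                stat += (observed[(v, b)] - expected) ** 2 / expected
    return stat


def sample_fiber(edges, blocks, groups, rng):
    """Draw a graph uniformly among those with the same block-pair edge counts."""
    counts = block_pair_counts(edges, blocks, groups)
    new_edges = []
    for key, dyads in groups.items():
        new_edges.extend(rng.sample(dyads, counts[key]))
    return new_edges


def monte_carlo_p_value(edges, blocks, n_samples=500, seed=0):
    """Share of sampled graphs whose statistic is at least the observed one."""
    rng = random.Random(seed)
    groups = dyads_by_block_pair(blocks)
    observed = chi_square_stat(edges, blocks, groups)
    hits = sum(
        chi_square_stat(sample_fiber(edges, blocks, groups, rng), blocks, groups)
        >= observed
        for _ in range(n_samples)
    )
    return (1 + hits) / (1 + n_samples)
```

For example, `monte_carlo_p_value([(0, 1), (0, 2), (0, 3), (0, 4), (0, 5)], [0] * 6)` tests a six-node star against a one-block model; the star's degree heterogeneity should yield a small p-value. When block labels are latent, they would first have to be estimated, which this sketch does not attempt.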

 
Award ID(s):
1947919
NSF-PAR ID:
10463118
Publisher / Repository:
Oxford University Press
Journal Name:
Journal of the Royal Statistical Society Series B: Statistical Methodology
Volume:
86
Issue:
1
ISSN:
1369-7412
Format(s):
Medium: X Size: p. 90-121
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    When testing hypotheses about which of two competing models, say A and B, is better, the difference is often not significant. An alternative, complementary approach is to measure how often model A is better than model B, regardless of how slight or large the difference. The hypothesis concerns whether the percentage of time that model A is better than model B is larger than 50%. One generalized test statistic that can be used is the power-divergence test, which encompasses many familiar goodness-of-fit test statistics, such as the log-likelihood-ratio and Pearson X² tests. Theoretical results justify using the χ² distribution with k − 1 degrees of freedom for the entire family of test statistics, where k is the number of categories. However, these results assume that the underlying data are independent and identically distributed, an assumption that is often violated. Empirical results demonstrate that the reduction to two categories (i.e., model A is better than model B versus model B is better than model A) results in a test that is reasonably robust to even severe departures from temporal independence, as well as to contemporaneous correlation. The test is demonstrated on two different example verification sets: 6-h forecasts of eddy dissipation rate (m^(2/3) s⁻¹) from two versions of the Graphical Turbulence Guidance model, and 12-h forecasts of 2-m temperature (°C) and 10-m wind speed (m s⁻¹) from two versions of the High-Resolution Rapid Refresh model. The novelty of this paper is in demonstrating the utility of the power-divergence statistic in the face of temporally dependent data, as well as the emphasis on testing for the “frequency-of-better” alongside more traditional measures.

     
  2. Abstract

    Performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails to characterise the fit of the model to the underlying conditional law of labels given the feature vector (Y∣X), e.g. due to model misspecification, overfitting, and high dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit for a general binary classifier. Our framework does not make any parametric assumption on the conditional law Y∣X and treats it as a black-box oracle model which can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis test of the form H0: E[D_f(Bern(η(X)) ‖ Bern(η̂(X)))] ≤ τ, where D_f represents an f-divergence function, and η(x) and η̂(x) denote, respectively, the true and the estimated likelihood that a feature vector x admits a positive label. We propose a novel test, called Goodness-of-fit with Randomisation and Scoring Procedure (GRASP), for testing H0, which works in finite-sample settings regardless of the distribution of the features (distribution-free). We also propose model-X GRASP, designed for model-X settings where the joint distribution of the feature vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.

     
  3. Summary

    In many biomedical studies, we are interested in comparing treatment effects with an inherent ordering. We propose a quadratic score test (QST) based on a quadratic inference function for detecting an order in treatment effects for correlated data. The quadratic inference function is similar to the negative of a log-likelihood, and it provides test statistics in the spirit of a χ²-test for testing nested hypotheses as well as for assessing the goodness of fit of model assumptions. Under the null hypothesis of no order restriction, it is shown that the QST statistic has a Wald-type asymptotic representation and that the asymptotic distribution of the QST statistic is a weighted χ²-distribution. Furthermore, an asymptotic distribution of the QST statistic under an arbitrary convex cone alternative is provided. The performance of the QST is investigated through Monte Carlo simulation experiments. Analysis of the polyposis data demonstrates that the QST outperforms the Wald test when data are highly correlated with a small sample size and there is a significant amount of missing data with a small number of clusters. The proposed test statistic accommodates both time-dependent and time-independent covariates in a model.

     
  4. Summary

    We consider functional measurement error models, i.e. models where covariates are measured with error and yet no distributional assumptions are made about the mismeasured variable. We propose and study a score-type local test and an orthogonal series-based, omnibus goodness-of-fit test in this context, where no likelihood function is available or calculated—i.e. all the tests are proposed in the semiparametric model framework. We demonstrate that our tests have optimality properties and computational advantages that are similar to those of the classical score tests in the parametric model framework. The test procedures are applicable to several semiparametric extensions of measurement error models, including when the measurement error distribution is estimated non-parametrically as well as for generalized partially linear models. The performance of the local score-type and omnibus goodness-of-fit tests is demonstrated through simulation studies and analysis of a nutrition data set.

     
  5. Abstract

    Genome‐wide association studies (GWAS) have led to rapid growth in detecting genetic variants associated with various phenotypes. Owing to the great number of publicly accessible GWAS summary statistics and the difficulty of obtaining individual‐level genotype data, many existing gene‐based association tests have been adapted to require only GWAS summary statistics rather than individual‐level data. However, these association tests are restricted to unrelated individuals and thus do not apply to family samples directly. Moreover, due to its flexibility and effectiveness, the linear mixed model has been increasingly utilized in GWAS to handle correlated data, such as family samples. However, it remains unknown how to perform gene‐based association tests in family samples using the GWAS summary statistics estimated from the linear mixed model. In this study, we show that, when family size is negligible compared to the total sample size, the diagonal block structure of the kinship matrix makes it possible to approximate the correlation matrix of marginal Z scores by the linkage disequilibrium matrix. Based on this result, current methods utilizing summary statistics for unrelated individuals can be directly applied to family data without any modifications. Our simulation results demonstrate that this proposed strategy controls the type 1 error rate well in various situations. Finally, we exemplify the usefulness of the proposed approach with a dental caries GWAS data set.

     
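The power-divergence test mentioned in item 1 above can be illustrated with a short sketch for the two-category "frequency-of-better" setting. It assumes the standard Cressie-Read form of the statistic, a null of equal win frequencies, and independent comparisons; the function names and the win counts in the usage note are made up for illustration and do not come from that paper. With two categories the reference distribution is chi-square with one degree of freedom, whose upper tail can be written with the complementary error function.

```python
import math


def power_divergence(observed, expected, lam):
    """Cressie-Read power-divergence statistic with index lam.

    lam = 1 gives Pearson's X^2; lam = 0 is taken as the limiting
    log-likelihood-ratio statistic G^2.
    """
    if lam == 0:
        return 2.0 * sum(
            o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
        )
    if lam == -1:
        raise ValueError("lam = -1 needs a separate limiting form; not covered here")
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 * total / (lam * (lam + 1.0))


def frequency_of_better_test(wins_a, wins_b, lam=1.0):
    """Test whether A and B win equally often (ties assumed already dropped)."""
    n = wins_a + wins_b
    stat = power_divergence([wins_a, wins_b], [n / 2.0, n / 2.0], lam)
    # Upper tail of chi-square with 1 degree of freedom (k - 1 with k = 2).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value
```

Calling `frequency_of_better_test(60, 40)` with these made-up counts returns the Pearson statistic and a two-sided p-value; a one-sided version for "A is better more than 50% of the time" would additionally need the direction of the difference.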