Title: The Stopping Rule Principle and Confirmational Reliability
The stopping rule for a sequential experiment is the rule or procedure for determining when that experiment should end. Accordingly, the stopping rule principle (SRP) states that the evidential relationship between the final data from a sequential experiment and a hypothesis under consideration does not depend on the stopping rule: the same data should yield the same evidence, regardless of which stopping rule was used. I clarify and provide a novel defense of two interpretations of the main argument against the SRP, the foregone conclusion argument. According to the first, the SRP allows highly confirmationally unreliable experiments (a concept I make precise) to confirm highly. According to the second, it entails the evidential equivalence of experiments that differ significantly in their confirmational reliability. I rebut several attempts to deflate or deflect the foregone conclusion argument, drawing connections with replication in science and the likelihood principle.
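The worry behind the foregone conclusion argument is often illustrated with a simple simulation. The sketch below is not taken from the paper; it is a minimal illustration under stated assumptions: data are standard normal with true mean 0 (so the null hypothesis is true), the test statistic is the usual z-score, and the names `stops_with_rejection`, `fixed_n_rejection`, `rejection_rate`, and `compare_stopping_rules` are hypothetical. It compares a stopping rule that ends the experiment as soon as the z-score looks nominally significant with a rule that fixes the sample size in advance.

```python
import math
import random


def stops_with_rejection(max_n, z_crit, rng):
    """Optional stopping: draw N(0, 1) observations one at a time and stop
    (reporting a 'rejection') the first time |z| exceeds z_crit.
    The null (mean 0) is true by construction."""
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(0.0, 1.0)
        if abs(total / math.sqrt(n)) > z_crit:
            return True
    return False


def fixed_n_rejection(n, z_crit, rng):
    """Fixed sample size: draw exactly n observations, then test once."""
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n))
    return abs(total / math.sqrt(n)) > z_crit


def rejection_rate(trial, runs, seed):
    """Fraction of simulated experiments that end in a rejection."""
    rng = random.Random(seed)
    return sum(trial(rng) for _ in range(runs)) / runs


def compare_stopping_rules(max_n=500, z_crit=1.96, runs=1000):
    """Return (optional-stopping rate, fixed-n rate) under a true null."""
    optional = rejection_rate(
        lambda rng: stops_with_rejection(max_n, z_crit, rng), runs, seed=0
    )
    fixed = rejection_rate(
        lambda rng: fixed_n_rejection(max_n, z_crit, rng), runs, seed=1
    )
    return optional, fixed
```

Calling `compare_stopping_rules()` returns the two rejection rates. Under these assumptions the optional-stopping rate should come out far above the nominal 5% level while the fixed-sample rate stays near it, even though any particular final data set could have arisen under either rule. That gap in reliability is the kind of difference the SRP treats as evidentially irrelevant.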
Award ID(s): 2042366
PAR ID: 10434013
Author(s) / Creator(s):
Date Published:
Journal Name: Journal for General Philosophy of Science
ISSN: 0925-4560
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Scholkopf, Bernhard; Uhler, Caroline; Zhang, Kun (Ed.)
    In order to test if a treatment is perceptibly different from a placebo in a randomized experiment with covariates, classical nonparametric tests based on ranks of observations/residuals have been employed (e.g., by Rosenbaum), with finite-sample valid inference enabled via permutations. This paper proposes a different principle on which to base inference: if, with access to all covariates and outcomes but without access to any treatment assignments, one can form a ranking of the subjects that is sufficiently nonrandom (e.g., mostly treated followed by mostly control), then we can confidently conclude that there must be a treatment effect. Based on a more nuanced, quantifiable version of this principle, we design an interactive test called i-bet: the analyst forms a single permutation of the subjects one element at a time, and at each step the analyst bets toy money on whether that subject was actually treated or not, and learns the truth immediately after. The wealth process forms a real-valued measure of evidence against the global causal null, and we may reject the null at level α if the wealth ever crosses 1/α. Apart from providing a fresh “game-theoretic” principle on which to base the causal conclusion, the i-bet has other statistical and computational benefits, for example (A) allowing a human to adaptively design the test statistic based on increasing amounts of data being revealed (along with any working causal models and prior knowledge), and (B) not requiring permutation resampling, instead noting that under the null, the wealth forms a nonnegative martingale, and the type-1 error control of the aforementioned decision rule follows from a tight inequality by Ville. Further, if the null is not rejected, new subjects can later be added and the test can simply be continued, without any corrections (unlike with permutation p-values). Numerical experiments demonstrate good power under various heterogeneous treatment effects. We first describe the i-bet test for two-sample comparisons with unpaired data, and then adapt it to paired data, multi-sample comparison, and sequential settings; these may be viewed as interactive martingale variants of the Wilcoxon, Kruskal-Wallis, and Friedman tests.
  2. Forward selection (FS) is a popular variable selection method for linear regression. But theoretical understanding of FS with a diverging number of covariates is still limited. We derive sufficient conditions for FS to attain model selection consistency. Our conditions are similar to those for orthogonal matching pursuit, but are obtained using a different argument. When the true model size is unknown, we derive sufficient conditions for model selection consistency of FS with a data‐driven stopping rule, based on a sequential variant of cross‐validation. As a byproduct of our proofs, we also have a sharp (sufficient and almost necessary) condition for model selection consistency of “wrapper” forward search for linear regression. We illustrate intuition and demonstrate performance of our methods using simulation studies and real datasets. 
  3. Purpose: The experiment reported here compared two hypotheses for the poor statistical and artificial grammar learning often seen in children and adults with developmental language disorder (DLD; also known as specific language impairment). The procedural learning deficit hypothesis states that implicit learning of rule-based input is impaired, whereas the sequential pattern learning deficit hypothesis states that poor performance is only seen when learners must implicitly compute sequential dependencies. The current experiment tested learning of an artificial grammar that could be learned via feature activation, as observed in an associatively organized lexicon, without computing sequential dependencies and should therefore be learnable on the sequential pattern learning deficit hypothesis, but not on the procedural learning deficit hypothesis. Method: Adults with DLD and adults with typical language development (TD) listened to consonant–vowel–consonant–vowel familiarization words from one of two artificial phonological grammars: Family Resemblance (two out of three features) and a control (exclusive OR, in which both consonants are voiced OR both consonants are voiceless) grammar in which no learning was predicted for either group. At test, all participants rated 32 test words as to whether or not they conformed to the pattern in the familiarization words. Results: Adults with DLD and adults with TD showed equal and robust learning of the Family Resemblance grammar, accepting significantly more conforming than nonconforming test items. Both groups who were familiarized with the Family Resemblance grammar also outperformed those who were familiarized with the OR grammar, which, as predicted, was learned by neither group. Conclusion: Although adults and children with DLD often underperform, compared to their peers with TD, on statistical and artificial grammar learning tasks, poor performance appears to be tied to the implicit computation of sequential dependencies, as predicted by the sequential pattern learning deficit hypothesis.
  4. The problem of anomaly detection among multiple processes is considered within the framework of sequential design of experiments. The objective is an active inference strategy consisting of a selection rule governing which process to probe at each time, a stopping rule on when to terminate the detection, and a decision rule on the final detection outcome. The performance measure is the Bayes risk that takes into account not only sample complexity and detection errors, but also costs associated with switching across processes. While the problem is a partially observable Markov decision process to which optimal solutions are generally intractable, a low-complexity deterministic policy is shown to be asymptotically optimal and offer significant performance improvement over existing methods in the finite regime. 
  5. We show that “full-bang” control is optimal in a problem which combines features of (i) sequential least-squares estimation with Bayesian updating, for a random quantity observed in a bath of white noise; (ii) bounded control of the rate at which observations are received, with a superquadratic cost per unit time; and (iii) “fast” discretionary stopping. We also develop the optimal filtering and stopping rules in this context.
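The betting scheme summarized in item 1 of the list above (a wealth process that rejects the null at level α once it reaches 1/α) can be sketched in a few lines. This is a simplified illustration, not the authors' i-bet implementation: it assumes each subject is treated independently with probability 1/2 under the null, and the names `run_betting_test`, `null_rejection_rate`, and the `strategy` callback are hypothetical.

```python
import random


def run_betting_test(labels, strategy, alpha=0.05):
    """Simplified betting test (an assumption-laden sketch, not i-bet itself).

    labels:   sequence of booleans, True if the subject was treated.
    strategy: function mapping the labels revealed so far to a signed bet
              in [-1, 1]; positive bets on 'treated', negative on 'control'.
    Returns (rejected, final_wealth).
    """
    wealth = 1.0
    revealed = []
    for label in labels:
        bet = strategy(revealed)  # chosen before this label is revealed
        if not -1.0 <= bet <= 1.0:
            raise ValueError("bet must lie in [-1, 1]")
        outcome = 1.0 if label else -1.0
        # Fair payoff when P(treated) = 1/2: expected multiplier is 1,
        # so wealth is a nonnegative martingale under the assumed null.
        wealth *= 1.0 + bet * outcome
        revealed.append(label)
        if wealth >= 1.0 / alpha:
            return True, wealth  # Ville's inequality bounds this event by alpha
    return False, wealth


def null_rejection_rate(n_subjects, runs, bet=0.2, alpha=0.05, seed=0):
    """Estimate how often a constant-bet strategy rejects under the assumed null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(runs):
        labels = [rng.random() < 0.5 for _ in range(n_subjects)]
        rejected, _ = run_betting_test(labels, lambda revealed: bet, alpha)
        rejections += rejected
    return rejections / runs
```

Because the test may stop the moment the wealth crosses 1/α, the stopping rule is built into the guarantee itself, which is why, as the abstract notes, new subjects can be added later without correction.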