In addition to scientific questions, clinical trialists often explore or require other design features, such as increasing power while controlling the type I error rate, minimizing unnecessary exposure to inferior treatments, and comparing multiple treatments in one clinical trial. We propose implementing an adaptive seamless design (ASD) with response-adaptive randomization (RAR) to satisfy these varied design objectives. However, combining ASD and RAR poses a challenge in controlling the type I error rate. In this paper, we investigate how to exploit the advantages of the two adaptive methods while controlling the type I error rate, and we provide the theoretical foundation for the procedure. Numerical studies demonstrate that our methods can achieve efficiency and ethical objectives while controlling the type I error rate.
Causal estimators for incorporating external controls in randomized trials with longitudinal outcomes
Abstract: Incorporating external data, such as external controls, holds the promise of improving the efficiency of traditional randomized controlled trials, especially when treating rare diseases or diseases with unmet needs. To this end, we propose novel weighting estimators grounded in the causal inference framework; Bayesian methods are also discussed as an alternative framework. From a trial design perspective, operating characteristics, including type I error and power, are particularly important and are assessed in realistic simulation studies representing a variety of practical scenarios. Our proposed weighting estimators achieve a significant power gain while maintaining the type I error close to the nominal value of 0.05. An empirical application of the methods is demonstrated through a Phase III clinical trial in a rare disease.
- Award ID(s): 1934568
- PAR ID: 10555963
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Journal of the Royal Statistical Society Series A: Statistics in Society
- ISSN: 0964-1998
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Differential item functioning (DIF) is often used to examine validity evidence of alternate form test accommodations. Unfortunately, traditional approaches for evaluating DIF are prone to selection bias. This article proposes a novel DIF framework that capitalizes on regression discontinuity design analysis to control for selection bias. A simulation study was performed to compare the new framework with traditional logistic regression, with respect to Type I error and power rates of the uniform DIF test statistics and bias and root mean square error of the corresponding effect size estimators. The new framework better controlled the Type I error rate and demonstrated minimal bias but suffered from low power and lack of precision. Implications for practice are discussed.
- Abstract: Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power, or ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.
- Abstract: Over the last three decades, many growth and yield systems developed for the southeast USA have incorporated methods to create a compatible basal area (BA) prediction and projection equation. This technique allows practitioners to calibrate BA models using both measurements at a given arbitrary age and the increment in BA when time series panel data are available. As a result, model parameters for either prediction or projection alternatives are compatible. One caveat of this methodology is that pairs of observations used to project forward have the same weight as observations from a single measurement age, regardless of the projection time interval. To address this problem, we introduce a variance-covariance structure giving different weights to predictions with variable intervals. To test this approach, prediction and projection equations were fitted simultaneously using an ad hoc matrix structure. We tested three different error structures in fitting models with (i) homoscedastic errors described by a single parameter (Method 1); (ii) heteroscedastic errors described with a weighting factor $w_t$ (Method 2); and (iii) errors including both prediction errors ($\overset{\smile}{\varepsilon}$) and projection errors ($\tilde{\varepsilon}$) in the weighting factor $w_t$ (Method 3). A rotation-age dataset covering nine sites, each including four blocks with four silvicultural treatments per block, was used for model calibration and validation, including explicit terms for each treatment. Fitting with an error structure that incorporated the combined error terms ($\overset{\smile}{\varepsilon}$ and $\tilde{\varepsilon}$) into the weighting factor $w_t$ (Method 3) generated better results in terms of root mean square error (RMSE) than the other two methods evaluated. Also, the system of equations that incorporated silvicultural treatments as dummy variables generated lower RMSE and Akaike information criterion (AIC) values under all methods. Our results show a substantial improvement over the current prediction-projection approach, resulting in consistent estimators for BA.
- Conformal prediction is a flexible framework for calibrating machine learning predictions, providing distribution-free statistical guarantees. In outlier detection, this calibration relies on a reference set of labeled inlier data to control the type-I error rate. However, obtaining a perfectly labeled inlier reference set is often unrealistic, and a more practical scenario involves access to a contaminated reference set containing a small fraction of outliers. This paper analyzes the impact of such contamination on the validity of conformal methods. We prove that under realistic, non-adversarial settings, calibration on contaminated data yields conservative type-I error control, shedding light on the inherent robustness of conformal methods. This conservativeness, however, typically results in a loss of power. To alleviate this limitation, we propose a novel, active data-cleaning framework that leverages a limited labeling budget and an outlier detection model to selectively annotate data points in the contaminated reference set that are suspected as outliers. By removing only the annotated outliers in this "suspicious" subset, we can effectively enhance power while mitigating the risk of inflating the type-I error rate, as supported by our theoretical analysis. Experiments on real datasets validate the conservative behavior of conformal methods under contamination and show that the proposed data-cleaning strategy improves power without sacrificing validity.
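As a rough illustration of the conservativeness described in the last related item (not the paper's data-cleaning procedure), the sketch below computes standard conformal p-values against a reference set in which a small fraction of scores come from an outlier distribution; the score distributions and the 5% contamination rate are arbitrary choices for illustration.

```python
# Conformal outlier detection with a contaminated reference set: contamination
# inflates reference scores, enlarging p-values and making the test conservative.
import numpy as np

def conformal_pvalue(score_test, scores_reference):
    """Conformal p-value: rank of the test score among the reference scores."""
    n = len(scores_reference)
    return (1 + np.sum(scores_reference >= score_test)) / (n + 1)

rng = np.random.default_rng(0)
reference = np.abs(rng.normal(size=500))             # nonconformity scores of reference inliers
reference[:25] = np.abs(rng.normal(3.0, 1.0, 25))    # 5% contamination with larger outlier scores
test_scores = np.abs(rng.normal(size=1000))          # scores of new inlier test points
pvals = np.array([conformal_pvalue(s, reference) for s in test_scores])
print("empirical type-I error at 0.05:", np.mean(pvals <= 0.05))  # typically below 0.05
```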