Objectives: The causal impact method (CIM) was recently introduced for the evaluation of binary interventions using observational time-series data. The CIM is appealing for practical use as it can adjust for temporal trends and account for the potential presence of unobserved confounding. However, the method was initially developed for applications involving large datasets, so its potential in small epidemiological studies remains unclear. Moreover, the effects that measurement error can have on the performance of the CIM have not yet been studied. The objective of this work is to investigate both of these open problems. Methods: Motivated by an existing dataset of HCV surveillance in the UK, we perform simulation experiments to investigate the effect of several characteristics of the data on the performance of the CIM, and we extend the method to account for outcome measurement error. Results: We identify multiple characteristics of the data that affect the ability of the CIM to detect an intervention effect, including the length of the time series, the variability of the outcome, and the degree of correlation between the outcome of the treated unit and the outcomes of controls. We show that measurement error can bias the estimated intervention effects and substantially reduce the power of the CIM. Using an extended CIM, some of these adverse effects can be mitigated. Conclusions: The CIM can provide satisfactory power for evaluating public health interventions, but it may produce misleading results in the presence of measurement error.
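As a concrete illustration of the setting studied here, below is a minimal sketch of a CIM analysis on a short simulated series, assuming the CIM refers to the Bayesian structural time-series approach implemented in the CausalImpact R package. The series length, control correlation, effect size, and noise levels are illustrative choices, not the paper's simulation settings.

```r
# Minimal sketch: CIM analysis of a short series with one correlated control
# (assumed CausalImpact implementation; all parameters are illustrative).
library(CausalImpact)
set.seed(1)
n_obs <- 52                                                  # short series, e.g. weekly counts
x <- 100 + as.numeric(arima.sim(list(ar = 0.6), n = n_obs))  # control series
y <- 1.2 * x + rnorm(n_obs, sd = 5)                          # treated unit, correlated with control
y[41:52] <- y[41:52] + 15                                    # intervention effect after point 40
impact <- CausalImpact(cbind(y, x),
                       pre.period = c(1, 40), post.period = c(41, 52))
summary(impact)                                              # posterior estimate of the effect
# Adding classical measurement error, e.g. y + rnorm(n_obs, sd = 10), widens
# the posterior interval and can mask the effect, consistent with the abstract.
```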
Optimal Design of Cluster- and Multisite-Randomized Studies Using Fallible Outcome Measures
Background: Evaluation studies frequently draw on fallible outcomes that contain significant measurement error. Ignoring outcome measurement error in the planning stages can undermine the sufficiency and efficiency of an otherwise well-designed study and can further constrain the evidence that studies bring to bear on the effectiveness of programs. Objectives: We develop simple adjustments to statistical power, minimum detectable effect (MDE), and optimal sample allocation formulas for two-level cluster- and multisite-randomized designs when the outcome is subject to measurement error. Results: The resulting adjusted formulas suggest that outcome measurement error typically amplifies treatment effect uncertainty, reduces power, increases the MDE, and undermines the efficiency of conventional optimal sampling schemes. Therefore, achieving adequate power for a given effect size will typically demand increased sample sizes when outcomes are fallible, while maintaining design efficiency will require that increasing portions of a budget be applied toward sampling a larger number of individuals within clusters. We illustrate evaluation planning with the new formulas, comparing them to conventional formulas using hypothetical examples based on recent empirical studies. To encourage adoption of the new formulas, we implement them in the R package PowerUpR and in the PowerUp software.
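To make the attenuation mechanism concrete, here is a minimal R sketch of statistical power for a two-level cluster-randomized design when the outcome has reliability below one. It uses the textbook classical-measurement-error result that a standardized effect size is attenuated by the square root of the outcome's reliability; the function and its parameterization are a sketch under that assumption, not the paper's formulas or the PowerUpR API.

```r
# Hedged sketch: two-sided power for a 2-level cluster-randomized trial with
# J clusters of n individuals, ICC rho, proportion P treated, and an outcome
# measured with reliability rel (rel = 1 means error-free).
power_crt2 <- function(delta, J, n, rho, P = 0.5, rel = 1, alpha = 0.05) {
  delta_obs <- delta * sqrt(rel)                       # attenuated effect size
  se <- sqrt(rho / (P * (1 - P) * J) +                 # between-cluster term
             (1 - rho) / (P * (1 - P) * J * n))        # within-cluster term
  df <- J - 2                                          # cluster-level test, no covariates
  tcrit <- qt(1 - alpha / 2, df)
  ncp <- delta_obs / se                                # noncentrality parameter
  1 - pt(tcrit, df, ncp) + pt(-tcrit, df, ncp)
}

power_crt2(delta = 0.25, J = 120, n = 20, rho = 0.20)             # error-free outcome
power_crt2(delta = 0.25, J = 120, n = 20, rho = 0.20, rel = 0.7)  # fallible outcome: lower power
```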
- Award ID(s): 1760884
- PAR ID: 10547089
- Publisher / Repository: SAGE Publications
- Date Published:
- Journal Name: Evaluation Review
- Volume: 43
- Issue: 3-4
- ISSN: 0193-841X
- Format(s): Medium: X
- Size(s): p. 189-225
- Sponsoring Org: National Science Foundation
More Like this
- Abstract Motivation: Here, we make available a second version of the BioTIME database, which compiles records of abundance estimates for species in sample events of ecological assemblages through time. The updated version expands version 1.0 of the database by doubling the number of studies and includes substantial additional curation of the taxonomic accuracy of the records, as well as the metadata. Moreover, we now provide an R package (BioTIMEr) to facilitate use of the database. Main Types of Variables Included: The database is composed of one main data table containing the abundance records and 11 metadata tables. The data are organised in a hierarchy of scales where 11,989,233 records are nested in 1,603,067 sample events, from 553,253 sampling locations, which are nested in 708 studies. A study is defined as a sampling methodology applied to an assemblage for a minimum of 2 years. Spatial Location and Grain: Sampling locations in BioTIME are distributed across the planet, including marine, terrestrial and freshwater realms. Spatial grain size and extent vary across studies depending on sampling methodology. We recommend gridding of sampling locations into areas of consistent size. Time Period and Grain: The earliest time series in BioTIME start in 1874, and the most recent records are from 2023. Temporal grain and duration vary across studies. We recommend sample-level rarefaction to ensure consistent sampling effort through time before calculating any diversity metric. Major Taxa and Level of Measurement: The database includes any eukaryotic taxa, with a combined total of 56,400 taxa. Software Format: csv and SQL.
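The rarefaction recommendation above is simple to act on. Below is a minimal R sketch of sample-level rarefaction before computing yearly richness; the column names (YEAR, SAMPLE_DESC, valid_name) are assumptions loosely modeled on BioTIME's data table, not the BioTIMEr API.

```r
# Hedged sketch: draw an equal number of sample events per year, then count
# unique taxa. Column names are assumed, not guaranteed to match BioTIME.
rarefied_richness <- function(df, n_samples) {
  sapply(split(df, df$YEAR), function(yr) {
    keep <- sample(unique(yr$SAMPLE_DESC), n_samples)        # equalize effort
    length(unique(yr$valid_name[yr$SAMPLE_DESC %in% keep]))  # taxa richness
  })
}
# n_samples must not exceed the smallest yearly sample count, e.g.:
# n_min <- min(table(unique(df[c("YEAR", "SAMPLE_DESC")])$YEAR))
```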
- Abstract Purpose: Most commercially available treatment planning systems (TPSs) approximate the continuous delivery of volumetric modulated arc therapy (VMAT) plans with a series of discretized static beams for treatment planning, which can make VMAT dose computation extremely inefficient. In this study, we developed a polar-coordinate-based pencil beam (PB) algorithm for efficient VMAT dose computation with high-resolution gantry angle sampling that can improve the computational efficiency and reduce the dose discrepancy due to the angular under-sampling effect. Methods and Materials: 6 MV pencil beams were simulated on a uniform cylindrical phantom under an EGSnrc Monte Carlo (MC) environment. The MC-generated PB kernels were collected in the polar coordinate system for each bixel on a fluence map and subsequently fitted via a series of Gaussians. The fluence was calculated using a detectors'-eye view with off-axis and MLC transmission factors corrected. Doses of a VMAT arc on the phantom were computed by summing the convolution results between the corresponding PB kernels and fluence for each bixel in the polar coordinate system. The convolution was performed using fast Fourier transform to expedite the computing speed. The calculated doses were converted to the Cartesian coordinate system and compared with the reference dose computed by a collapsed cone convolution (CCC) algorithm of the TPS. A heterogeneous phantom was created to study heterogeneity corrections using the proposed algorithm. Ten VMAT arcs were included to evaluate the algorithm's performance. Gamma analysis and computation complexity theory were used to measure the dosimetric accuracy and computational efficiency, respectively. Results: The dosimetric comparisons on the homogeneous phantom between the proposed PB algorithm and the CCC algorithm for 10 VMAT arcs demonstrate that the proposed algorithm can achieve a dosimetric accuracy comparable to that of the CCC algorithm, with average gamma passing rates of 96% (2%/2mm) and 98% (3%/3mm). In addition, the proposed algorithm can provide better computational efficiency for VMAT dose computation using a PC equipped with a 4-core processor, compared to the CCC algorithm utilizing a dual 10-core server. Moreover, computation complexity theory reveals that the proposed algorithm has a great advantage with regard to computational efficiency for VMAT dose computation on homogeneous medium, especially when a fine angular sampling rate is applied. This can support a reduction in dose errors from the angular under-sampling effect by using a finer angular sampling rate, while still preserving a practical computing speed. For dose calculation on the heterogeneous phantom, the proposed algorithm with heterogeneity corrections can still offer reasonable dosimetric accuracy with computational efficiency comparable to that of the CCC algorithm. Conclusions: We proposed a novel polar-coordinate-based pencil beam algorithm for VMAT dose computation that enables better computational efficiency while maintaining clinically acceptable dosimetric accuracy and reducing dose error caused by the angular under-sampling effect. It also provides a flexible VMAT dose computation structure that allows adjustable sampling rates and direct dose computation in regions of interest, which makes the algorithm potentially useful for clinical applications such as independent dose verification for VMAT patient-specific QA.
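The kernel-times-fluence convolution at the core of the algorithm is easy to illustrate. Below is a minimal R sketch of FFT-based 2-D convolution on a toy Cartesian grid; the Gaussian kernel, grid size, and field shape are stand-ins, not the paper's polar-coordinate implementation.

```r
# Hedged sketch: dose as convolution of a pencil-beam kernel with a fluence
# map via the convolution theorem. Toy Cartesian grid; circular wrap-around
# at the edges is ignored for simplicity.
fft_convolve2d <- function(fluence, kernel) {
  padded <- matrix(0, nrow(fluence), ncol(fluence))
  padded[1:nrow(kernel), 1:ncol(kernel)] <- kernel         # zero-pad kernel to grid
  Re(fft(fft(fluence) * fft(padded), inverse = TRUE)) / length(fluence)
}

kernel  <- outer(dnorm(-5:5), dnorm(-5:5))                 # Gaussian-fitted kernel (illustrative)
fluence <- matrix(0, 64, 64)
fluence[28:36, 28:36] <- 1                                 # open 9 x 9 bixel field
dose <- fft_convolve2d(fluence, kernel / sum(kernel))      # normalized kernel
```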
- A number of information retrieval studies have been done to assess which statistical techniques are appropriate for comparing systems. However, these studies are focused on TREC-style experiments, which typically have fewer than 100 topics. There is no similar line of work for large search and recommendation experiments; such studies typically have thousands of topics or users and much sparser relevance judgements, so it is not clear whether recommendations for analyzing traditional TREC experiments apply to these settings. In this paper, we empirically study the behavior of significance tests with large search and recommendation evaluation data. Our results show that the Wilcoxon and Sign tests exhibit significantly higher Type-1 error rates for large sample sizes than the bootstrap, randomization, and t-tests, which were more consistent with the expected error rate. While the statistical tests displayed differences in their power for smaller sample sizes, they showed no difference in their power for large sample sizes. We recommend that the Sign and Wilcoxon tests not be used to analyze large-scale evaluation results. Our results demonstrate that with Top-N recommendation and large search evaluation data, most tests would have a 100% chance of finding statistically significant results. Therefore, effect size should be used to determine practical or scientific significance.
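The inflated Type-1 error reported for the Wilcoxon test arises because the signed-rank test assumes symmetric paired differences, which skewed evaluation metrics violate even when the two systems' means are equal. Here is a minimal R sketch of that mechanism, using an illustrative score distribution rather than the study's evaluation data.

```r
# Hedged sketch: empirical Type-1 error at a large sample size when paired
# differences have mean zero but are skewed. Distributions are illustrative.
set.seed(42)
n <- 5000                                    # "large" number of users/topics
pvals <- replicate(1000, {
  a <- rexp(n)                               # system A: skewed scores, mean 1
  b <- rnorm(n, mean = 1, sd = 0.5)          # system B: symmetric, same mean
  c(t      = t.test(a, b, paired = TRUE)$p.value,
    wilcox = wilcox.test(a, b, paired = TRUE)$p.value)
})
rowMeans(pvals < 0.05)                       # t-test near .05; Wilcoxon inflated
```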
- Abstract Background: This study presents user evaluation studies to assess the effect of information rendered by an interventional planning software on the operator's ability to plan transrectal magnetic resonance (MR)-guided prostate biopsies using actuated robotic manipulators. Methods: An intervention planning software was developed based on the clinical workflow followed for MR-guided transrectal prostate biopsies. The software was designed to interface with a generic virtual manipulator and simulate an intervention environment using 2D and 3D scenes. User studies were conducted with urologists using the developed software to plan virtual biopsies. Results: User studies demonstrated that urologists with prior experience in using 3D software completed the planning in less time. 3D scenes were required to control all degrees of freedom of the manipulator, while 2D scenes were sufficient for planar motion of the manipulator. Conclusions: The study provides insights on using 2D versus 3D environments from a urologist's perspective for different operational modes of MR-guided prostate biopsy systems.