Search for: All records

Award ID contains: 1646108

  1. Abstract

    Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
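The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' covariate-adjustment routine: as a simpler stand-in, it compares mean residuals (outcome minus prediction) between arms, which remains unbiased under complete randomization because the predictions are fixed before treatment assignment. The function names, the least-squares model standing in for a machine learning algorithm, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def fit_outcome_model(x_obs, y_obs):
    """Fit a least-squares outcome model on observational data only.
    (Hypothetical stand-in for any machine learning algorithm.)"""
    design = np.column_stack([np.ones(len(x_obs)), x_obs])
    coef, *_ = np.linalg.lstsq(design, y_obs, rcond=None)
    return coef

def predict(coef, x):
    """Generate predictions for RCT participants from their covariates."""
    return np.column_stack([np.ones(len(x)), x]) @ coef

def adjusted_effect(y, z, y_hat):
    """Difference in mean residuals between treatment (z == 1) and control (z == 0)."""
    resid = y - y_hat
    return resid[z == 1].mean() - resid[z == 0].mean()

def simulate(seed=0):
    """Toy data: a large observational sample and a small RCT (all values assumed)."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0, 0.5])
    x_obs = rng.normal(size=(5000, 3))
    y_obs = x_obs @ beta + rng.normal(size=5000)
    coef = fit_outcome_model(x_obs, y_obs)  # trained without any RCT outcomes

    n = 200
    x_rct = rng.normal(size=(n, 3))
    y0 = x_rct @ beta + rng.normal(size=n)          # control potential outcomes
    z = rng.permutation(np.repeat([0, 1], n // 2))  # complete randomization
    y = y0 + 0.3 * z                                # assumed constant effect of 0.3
    y_hat = predict(coef, x_rct)

    unadjusted = y[z == 1].mean() - y[z == 0].mean()
    adjusted = adjusted_effect(y, z, y_hat)
    return unadjusted, adjusted

if __name__ == "__main__":
    print(simulate())
```

Because the model is trained only on nonparticipants, a poor model in this sketch costs precision but not unbiasedness; a good model shrinks the residual variance and hence the standard error.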

  2. Abstract

    Convergent research identifies a general factor (“P factor”) that confers transdiagnostic risk for psychopathology. Large-scale networks are key organizational units of the human brain. However, studies of altered network connectivity patterns associated with the P factor are limited, especially in early adolescence when most mental disorders are first emerging. We studied 11,875 9- and 10-year-olds from the Adolescent Brain and Cognitive Development (ABCD) study, of whom 6593 had high-quality resting-state scans. Network contingency analysis was used to identify altered interconnections associated with the P factor among 16 large-scale networks. These connectivity changes were then further characterized with quadrant analysis that quantified the directionality of P factor effects in relation to neurotypical patterns of positive versus negative connectivity across connections. The results showed that the P factor was associated with altered connectivity across 28 network cells (i.e., sets of connections linking pairs of networks); p_PERMUTATION values < 0.05, FDR-corrected for multiple comparisons. Higher P factor scores were associated with hypoconnectivity within default network and hyperconnectivity between default network and multiple control networks. Among connections within these 28 significant cells, the P factor was predominantly associated with “attenuating” effects (67%; p_PERMUTATION < 0.0002), i.e., reduced connectivity at neurotypically positive connections and increased connectivity at neurotypically negative connections. These results demonstrate that the general factor of psychopathology produces attenuating changes across multiple networks including default network, involved in spontaneous responses, and control networks involved in cognitive control. Moreover, they clarify mechanisms of transdiagnostic risk for psychopathology and invite further research into developmental causes of distributed attenuated connectivity.
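The notion of an “attenuating” effect described above can be sketched in a few lines. This is only an illustration of the sign logic stated in the abstract (an effect is attenuating when it opposes the sign of the neurotypical connectivity); it does not reproduce the study's quadrant analysis or its permutation test. The function name and the example values are hypothetical.

```python
import numpy as np

def attenuating_fraction(baseline, effect):
    """Share of connections whose P factor effect opposes the baseline sign.

    baseline: neurotypical connectivity per connection (positive or negative)
    effect:   estimated P factor effect per connection
    Connections with a zero baseline or zero effect are ignored.
    """
    baseline = np.asarray(baseline, dtype=float)
    effect = np.asarray(effect, dtype=float)
    nonzero = (baseline != 0) & (effect != 0)
    opposing = np.sign(baseline[nonzero]) != np.sign(effect[nonzero])
    return opposing.mean()

if __name__ == "__main__":
    # Made-up values: two attenuating connections out of four.
    print(attenuating_fraction([0.4, 0.2, -0.3, -0.1], [-0.05, 0.01, 0.02, -0.03]))
```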

  3. Free, publicly-accessible full text available January 1, 2025
  4. Free, publicly-accessible full text available October 2, 2024
  5. Functional connections in the brain are frequently represented by weighted networks, with nodes representing locations in the brain and edges representing the strength of connectivity between these locations. One challenge in analyzing such data is that inference at the individual edge level is not particularly biologically meaningful; interpretation is more useful at the level of so-called functional systems, that is, groups of nodes and the connections between them. This is often called “graph-aware” inference in the neuroimaging literature. However, pooling over functional systems leads to significant loss of information and lower accuracy. Another challenge is correlation among edge weights within a subject, which makes inference based on independence assumptions unreliable. We address both of these challenges with a linear mixed effects model, which accounts for functional systems and for edge dependence, while still modeling individual edge weights to avoid loss of information. The model allows for comparing two populations, such as patients and healthy controls, both at the level of functional systems and at the level of individual edges, leading to biologically meaningful interpretations. We fit this model to resting state fMRI data on schizophrenic patients and healthy controls, obtaining interpretable results consistent with the schizophrenia literature.
    Free, publicly-accessible full text available September 1, 2024
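For item 5, the data layout that keeps individual edge weights while recording each edge's pair of functional systems can be sketched as follows. This shows only the labelling step, not the linear mixed effects model itself; the function name, the toy connectivity matrix, and the system assignments are assumptions for illustration.

```python
import numpy as np

def edges_to_long(adj, systems):
    """Flatten a symmetric connectivity matrix into per-edge records.

    adj:     (n, n) symmetric matrix of edge weights for one subject
    systems: length-n integer labels assigning each node to a functional system
    Returns node indices i, j, the unordered system pair ("cell") of each
    edge, and the edge weights, using the upper triangle only.
    """
    adj = np.asarray(adj, dtype=float)
    systems = np.asarray(systems)
    i, j = np.triu_indices(adj.shape[0], k=1)
    lo = np.minimum(systems[i], systems[j])
    hi = np.maximum(systems[i], systems[j])
    cells = list(zip(lo.tolist(), hi.tolist()))
    return i, j, cells, adj[i, j]

if __name__ == "__main__":
    # Toy subject: four nodes, two functional systems (values are made up).
    adj = np.array([[0, 1, 2, 4],
                    [1, 0, 6, 8],
                    [2, 6, 0, 3],
                    [4, 8, 3, 0]])
    i, j, cells, w = edges_to_long(adj, [0, 0, 1, 1])
    for row in zip(i.tolist(), j.tolist(), cells, w.tolist()):
        print(row)
```

Stacking such records across subjects, with a subject identifier and a group label added, gives a long-format table in which cell-level terms and edge-level weights coexist, which is the kind of input a mixed effects fit would typically take.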
  6. Free, publicly-accessible full text available July 3, 2024
  7. Free, publicly-accessible full text available July 1, 2024