This content will become publicly available on January 1, 2026

Title: Beyond Increasing Sample Sizes: Optimizing Effect Sizes in Neuroimaging Research on Individual Differences
Linking neurobiology to relatively stable individual differences in cognition, emotion, motivation, and behavior can require large sample sizes to yield replicable results. Given the nature of between-person research, sample sizes at least in the hundreds are likely to be necessary in most neuroimaging studies of individual differences, regardless of whether they are investigating the whole brain or more focal hypotheses. However, the appropriate sample size depends on the expected effect size. Therefore, we propose four strategies to increase effect sizes in neuroimaging research, which may help to enable the detection of replicable between-person effects in samples in the hundreds rather than the thousands: (1) theoretical matching between neuroimaging tasks and behavioral constructs of interest; (2) increasing the reliability of both neural and psychological measurement; (3) individualization of measures for each participant; and (4) using multivariate approaches with cross-validation instead of univariate approaches. We discuss challenges associated with these methods and highlight strategies for improvements that will help the field to move toward a more robust and accessible neuroscience of individual differences.
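The link between expected effect size, measurement reliability, and required sample size can be illustrated with a minimal sketch. It is not taken from the article: it assumes a simple bivariate correlation, the standard Fisher z approximation for power, and Spearman's attenuation formula. The function names (`attenuated_r`, `n_for_correlation`) and the default alpha and power values are illustrative choices.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist


def attenuated_r(true_r, rel_x, rel_y):
    """Spearman's attenuation: the observed correlation shrinks with
    the reliabilities of the two measures (each between 0 and 1)."""
    return true_r * sqrt(rel_x * rel_y)


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation r with a
    two-sided test, using the Fisher z approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return ceil(((z_alpha + z_power) / atanh(r)) ** 2 + 3)


if __name__ == "__main__":
    # Hypothetical values: a true correlation of 0.3 measured with two
    # instruments whose reliabilities are each 0.6 or 0.9.
    for rel in (0.6, 0.9):
        r_obs = attenuated_r(0.3, rel, rel)
        print(f"reliability={rel}: observed r={r_obs:.2f}, "
              f"N needed={n_for_correlation(r_obs)}")
```

Under these assumptions, raising reliability increases the observable correlation and lowers the sample size needed to detect it, which is the logic behind strategy (2) in the abstract.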
Award ID(s):
1920653
PAR ID:
10603998
Author(s) / Creator(s):
Publisher / Repository:
MIT Press
Date Published:
Journal Name:
Journal of Cognitive Neuroscience
Volume:
37
Issue:
6
ISSN:
0898-929X
Page Range / eLocation ID:
1023 to 1034
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Langille, Morgan (Ed.)
    The metagenome embedded in urban sewage is an attractive new data source to understand urban ecology and assess human health status at scales beyond a single host. Analyzing the viral fraction of wastewater in the ongoing COVID-19 pandemic has shown the potential of wastewater as aggregated samples for early detection, prevalence monitoring, and variant identification of human diseases in large populations. However, using census-based population size instead of real-time population estimates can mislead the interpretation of data acquired from sewage, hindering assessment of representativeness, inference of prevalence, or comparisons of taxa across sites. Here, we show that taxon abundance and sub-species diversity in gut-associated microbiomes offer a new feature space to utilize for human population estimation. Using a population-scale human gut microbiome sample of over 1,100 people, we found that taxon-abundance distributions of gut-associated multi-person microbiomes exhibited generalizable relationships with respect to human population size. Here and throughout this paper, the human population size is essentially the sample size from the wastewater sample. We present a new algorithm, MicrobiomeCensus, for estimating human population size from sewage samples. MicrobiomeCensus harnesses the inter-individual variability in human gut microbiomes and performs maximum likelihood estimation based on simultaneous deviation of multiple taxa’s relative abundances from their population means. MicrobiomeCensus outperformed generic algorithms in data-driven simulation benchmarks and detected population size differences in field data. New theorems are provided to justify our approach. This research provides a mathematical framework for inferring population sizes in real time from sewage samples, paving the way for more accurate ecological and public health studies utilizing the sewage metagenome.
  2. An understanding of human brain individuality requires the integration of data on brain organization across people and brain regions, molecular and systems scales, as well as healthy and clinical states. Here, we help advance this understanding by leveraging methods from computational genomics to integrate large-scale genomic, transcriptomic, neuroimaging, and electronic-health record data sets. We estimated genetically regulated gene expression (gr-expression) of 18,647 genes, across 10 cortical and subcortical regions of 45,549 people from the UK Biobank. First, we showed that patterns of estimated gr-expression reflect known genetic–ancestry relationships, regional identities, as well as inter-regional correlation structure of directly assayed gene expression. Second, we performed transcriptome-wide association studies (TWAS) to discover 1,065 associations between individual variation in gr-expression and gray-matter volumes across people and brain regions. We benchmarked these associations against results from genome-wide association studies (GWAS) of the same sample and found hundreds of novel associations relative to these GWAS. Third, we integrated our results with clinical associations of gr-expression from the Vanderbilt Biobank. This integration allowed us to link genes, via gr-expression, to neuroimaging and clinical phenotypes. Fourth, we identified associations of polygenic gr-expression with structural and functional MRI phenotypes in the Human Connectome Project (HCP), a small neuroimaging-genomic data set with high-quality functional imaging data. Finally, we showed that estimates of gr-expression and magnitudes of TWAS were generally replicable and that the p-values of TWAS were replicable in large samples. Collectively, our results provide a powerful new resource for integrating gr-expression with population genetics of brain organization and disease.
  3. Meila, Marina; Zhang, Tong (Ed.)
    Stochastic Gradient Descent (SGD) is a popular tool in training large-scale machine learning models. Its performance, however, is highly variable, depending crucially on the choice of the step sizes. Accordingly, a variety of strategies for tuning the step sizes have been proposed, ranging from coordinate-wise approaches (a.k.a. “adaptive” step sizes) to sophisticated heuristics to change the step size in each iteration. In this paper, we study two step size schedules whose power has been repeatedly confirmed in practice: the exponential and the cosine step sizes. For the first time, we provide theoretical support for them, proving convergence rates for smooth non-convex functions, with and without the Polyak-Łojasiewicz (PL) condition. Moreover, we show the surprising property that these two strategies are adaptive to the noise level in the stochastic gradients of PL functions. That is, contrary to polynomial step sizes, they achieve almost optimal performance without needing to know the noise level or to tune their hyperparameters based on it. Finally, we conduct a fair and comprehensive empirical evaluation on real-world datasets with deep learning architectures. Results show that, despite requiring at most two hyperparameters to tune, these two strategies best or match the performance of various finely-tuned state-of-the-art strategies.
  4. While a recent upsurge in the application of neuroimaging methods to creative cognition has yielded encouraging progress toward understanding the neural underpinnings of creativity, the neural basis of barriers to creativity is as yet unexplored. Here, we report the first investigation into the neural correlates of one such recently identified barrier to creativity: anxiety specific to creative thinking, or creativity anxiety (Daker et al., 2019). We employed a machine-learning technique for exploring relations between functional connectivity and behavior (connectome-based predictive modeling; CPM) to investigate the functional connections underlying creativity anxiety. Using whole-brain resting-state functional connectivity data, we identified a network of connections or “edges” that predicted individual differences in creativity anxiety, largely comprising connections within and between regions of the executive and default networks and the limbic system. We then found that the edges related to creativity anxiety identified in one sample generalize to predict creativity anxiety in an independent sample. We additionally found evidence that the network of edges related to creativity anxiety was largely distinct from those found in previous work to be related to divergent creative ability (Beaty et al., 2018). In addition to being the first work on the neural correlates of creativity anxiety, this research also included the development of a new Chinese-language version of the Creativity Anxiety Scale, and demonstrated that key behavioral findings from the initial work on creativity anxiety are replicable across cultures and languages.
  5. Most research in the behavioral sciences aims to characterize effects of interest using sample means intended to describe the “typical” person. A difference in means is usually construed as a size difference in an effect common across subjects. However, mean effect size varies with both within-subject effect size and population prevalence (proportion of population showing the effect) in compared groups or across conditions. Few studies consider how prevalence affects mean effect size measurements and existing estimators of prevalence are, conversely, confounded by uncertainty about within-subject power. We introduce a widely applicable Bayesian method, the p-curve mixture model, that jointly estimates prevalence and effect size. Our approach outperforms existing prevalence estimation methods when within-subject power is uncertain and is sensitive to differences in prevalence or effect size across groups or experimental conditions. We present examples, extracting novel insights from existing datasets, and provide a user-facing software tool.