Abstract: Capsicum chinense (habanero pepper) exhibits substantial variation in fruit pungency, color, and flavor due to its rich secondary metabolite composition, including capsaicinoids, carotenoids, and volatile organic compounds (VOCs). To dissect the genetic and regulatory basis of these traits, we conducted an integrative analysis across 244 diverse accessions using metabolite profiling, genome-wide association studies (GWAS), and transcriptome-wide association studies (TWAS). GWAS identified 507 SNPs for capsaicinoids, 304 for carotenoids, and 1176 for VOCs, while TWAS linked gene expression to metabolite levels, highlighting biosynthetic and regulatory genes in phenylpropanoid, fatty acid, and terpenoid pathways. Segmental RNA sequencing across fruit tissues of contrasting accessions revealed 7034 differentially expressed genes, including MYB31, 3-ketoacyl-CoA synthase, phytoene synthase, and ABC transporters. Notably, AP2 transcription factors and pentatricopeptide repeat (PPR) proteins emerged as central regulators, co-expressed with carotenoid and VOC biosynthetic genes. High-resolution spatial transcriptomics (Stereo-seq) identified 74 genes with tissue-specific expression that overlap with GWAS and TWAS loci, reinforcing their regulatory relevance. To validate these candidates, we employed CRISPR/Cas9 to knock out AP2 and PPR genes in tomato. Widely targeted metabolomics and carotenoid profiling revealed major metabolic shifts: AP2 mutants accumulated higher levels of β-carotene and lycopene, whereas PPR mutants showed altered xanthophyll ester and apocarotenoid levels, supporting their roles in carotenoid flux and remodeling. This study provides the first integrative GWAS–TWAS–spatial transcriptomics analysis in C. chinense, revealing key regulators of fruit quality traits. These findings lay the groundwork for precision breeding and metabolic engineering to enhance nutritional and sensory attributes in peppers.
                    
                            
                            Integration of estimated regional gene expression with neuroimaging and clinical phenotypes at biobank scale
                        
                    
    
            An understanding of human brain individuality requires the integration of data on brain organization across people and brain regions, molecular and systems scales, as well as healthy and clinical states. Here, we help advance this understanding by leveraging methods from computational genomics to integrate large-scale genomic, transcriptomic, neuroimaging, and electronic-health record data sets. We estimated genetically regulated gene expression (gr-expression) of 18,647 genes, across 10 cortical and subcortical regions of 45,549 people from the UK Biobank. First, we showed that patterns of estimated gr-expression reflect known genetic–ancestry relationships, regional identities, as well as inter-regional correlation structure of directly assayed gene expression. Second, we performed transcriptome-wide association studies (TWAS) to discover 1,065 associations between individual variation in gr-expression and gray-matter volumes across people and brain regions. We benchmarked these associations against results from genome-wide association studies (GWAS) of the same sample and found hundreds of novel associations relative to these GWAS. Third, we integrated our results with clinical associations of gr-expression from the Vanderbilt Biobank. This integration allowed us to link genes, via gr-expression, to neuroimaging and clinical phenotypes. Fourth, we identified associations of polygenic gr-expression with structural and functional MRI phenotypes in the Human Connectome Project (HCP), a small neuroimaging-genomic data set with high-quality functional imaging data. Finally, we showed that estimates of gr-expression and magnitudes of TWAS were generally replicable and that the p-values of TWAS were replicable in large samples. Collectively, our results provide a powerful new resource for integrating gr-expression with population genetics of brain organization and disease.
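The two-stage logic behind estimating gr-expression and then testing it against a phenotype can be sketched as follows. This is a minimal illustration on made-up toy data, not the pipeline used in the study: the single-SNP least-squares weight, the function names, and every number below are assumptions introduced for illustration. Real analyses typically use multi-SNP prediction models trained on reference expression panels and formal association tests with covariates.

```python
from statistics import mean

def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Stage 1 (hypothetical reference panel): learn how genotype at one SNP
# (0, 1, or 2 copies of an allele) predicts measured expression of a gene.
ref_genotype = [0, 0, 1, 1, 2, 2]
ref_expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
weight, intercept = ols_slope_intercept(ref_genotype, ref_expression)

# Stage 2 (hypothetical target cohort): only genotypes and a phenotype are
# available, so expression is estimated from genotype using the learned weight.
cohort_genotype = [0, 1, 2, 1, 0, 2, 2, 0]
cohort_volume = [5.1, 5.6, 6.2, 5.4, 4.9, 6.0, 6.3, 5.0]  # toy regional volumes
gr_expression = [intercept + weight * g for g in cohort_genotype]

# Stage 3: test whether estimated expression tracks the phenotype.
twas_r = pearson_r(gr_expression, cohort_volume)
print(f"weight={weight:.3f}, correlation with phenotype={twas_r:.3f}")
```

Because the estimate in stage 2 is a function of genotype alone, any association found in stage 3 reflects the genetically regulated component of expression rather than expression measured directly in the cohort.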
        
    
- Award ID(s): 2207891
- PAR ID: 10552409
- Publisher / Repository: PLOS
- Date Published:
- Journal Name: PLOS Biology
- Volume: 22
- Issue: 9
- ISSN: 1545-7885
- Page Range / eLocation ID: e3002782
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Background: Genome-wide association studies (GWASs) have identified thousands of genetic variants that are associated with many complex traits. However, their biological mechanisms remain largely unknown. Transcriptome-wide association studies (TWAS) have recently been proposed as an invaluable tool for investigating the potential gene regulatory mechanisms underlying variant-trait associations. Specifically, TWAS integrate GWAS with expression mapping studies based on a common set of variants and aim to identify genes whose genetically regulated expression (GReX) is associated with the phenotype. Various methods have been developed for performing TWAS and/or similar integrative analyses. Each such method makes different modeling assumptions, and many were initially developed to answer different biological questions. Consequently, it is not straightforward to understand their modeling properties from a theoretical perspective. Results: We present a technical review of thirteen TWAS methods. Importantly, we show that these methods can all be viewed as two-sample Mendelian randomization (MR) analyses, which have been widely applied in GWASs for examining the causal effects of exposure on outcome. Viewing different TWAS methods from an MR perspective provides a unique angle for understanding their benefits and pitfalls. We systematically introduce the MR analysis framework, explain how features of the GWAS and expression data influence the adaptation of MR for TWAS, and re-interpret the modeling assumptions made in different TWAS methods from an MR angle. We finally describe future directions for TWAS methodology development. Conclusions: We hope that this review will serve as a useful reference for both methodologists who develop TWAS methods and practitioners who perform TWAS analysis.
- 
            ABSTRACT: Genome-wide association studies (GWAS) can identify genetic variants responsible for naturally occurring and quantitative phenotypic variation. Association studies therefore provide a powerful complement to approaches that rely on de novo mutations for characterizing gene function. Although bacteria should be amenable to GWAS, few GWAS have been conducted on bacteria, and the extent to which nonindependence among genomic variants (e.g., linkage disequilibrium [LD]) and the genetic architecture of phenotypic traits will affect GWAS performance is unclear. We apply association analyses to identify candidate genes underlying variation in 20 biochemical, growth, and symbiotic phenotypes among 153 strains of Ensifer meliloti. For 11 traits, we find genotype-phenotype associations that are stronger than expected by chance, with the candidates in relatively small linkage groups, indicating that LD does not preclude resolving association candidates to relatively small genomic regions. The significant candidates show an enrichment for single-nucleotide polymorphisms (SNPs) over gene presence-absence variation (PAV), and for five traits, candidates are enriched in large linkage groups, a possible signature of epistasis. Many of the variants most strongly associated with symbiosis phenotypes were in genes previously identified as being involved in nitrogen fixation or nodulation. For other traits, apparently strong associations were not stronger than the range of associations detected in permuted data. In sum, our data show that GWAS in bacteria may be a powerful tool for characterizing genetic architecture and identifying genes responsible for phenotypic variation. However, careful evaluation of candidates is necessary to avoid false signals of association.
            IMPORTANCE: Genome-wide association analyses are a powerful approach for identifying gene function. These analyses are becoming commonplace in studies of humans, domesticated animals, and crop plants but have rarely been conducted in bacteria. We applied association analyses to 20 traits measured in Ensifer meliloti, an agriculturally and ecologically important bacterium because it fixes nitrogen when in symbiosis with leguminous plants. We identified candidate alleles and gene presence-absence variants underlying variation in symbiosis traits, antibiotic resistance, and use of various carbon sources; some of these candidates are in genes previously known to affect these traits, whereas others are in genes that have not been well characterized. Our results point to the potential power of association analyses in bacteria, but also to the need to carefully evaluate the potential for false associations.
- 
            The ability to produce novel ideas is central to societal progress and innovation; however, little is known about the biological basis of creativity. Here, we investigate the organization of brain networks that support creativity by combining functional neuroimaging data with gene expression information. Given the multifaceted nature of creative thinking, we hypothesized that distributed connectivity would not only be related to individual differences in creative ability, but also delineate the cortical distributions of genes involved in synaptic plasticity. We defined neuroimaging phenotypes using a graph theory approach that detects local and distributed network circuits, then characterized the spatial associations between functional connectivity and cortical gene expression distributions. Our findings reveal strong spatial correlations between connectivity maps and sets of genes devoted to synaptic assembly and signaling. This connectomic-transcriptome approach thus identifies gene expression profiles associated with high creative ability, linking cognitive flexibility to neural plasticity in the human brain.
- 
            Abstract: A multistage variable selection method is introduced for detecting association signals in structured brain-wide and genome-wide association studies (brain-GWAS). Compared to conventional methods that link one voxel to one single nucleotide polymorphism (SNP), our approach is more efficient and powerful in selecting the important signals by integrating anatomic and gene grouping structures in the brain and the genome, respectively. It avoids resorting to a large number of multiple comparisons while effectively controlling the false discoveries. Validity of the proposed approach is demonstrated by both theoretical investigation and numerical simulations. We apply our proposed method to a brain-GWAS using Alzheimer's Disease Neuroimaging Initiative positron emission tomography (ADNI PET) imaging and genomic data. We confirm previously reported association signals and also uncover several novel SNPs and genes that are either associated with brain glucose metabolism or have their association significantly modified by Alzheimer's disease status.
 An official website of the United States government