Abstract Correlation among multiple phenotypes across related individuals may reflect some pattern of shared genetic architecture: individual genetic loci affect multiple phenotypes (an effect known as pleiotropy), creating observable relationships between phenotypes. A natural hypothesis is that pleiotropic effects reflect a relatively small set of common “core” cellular processes: each genetic locus affects one or a few core processes, and these core processes in turn determine the observed phenotypes. Here, we propose a method to infer such structure in genotype–phenotype data. Our approach, sparse structure discovery (SSD), is based on a penalized matrix decomposition designed to identify latent structure that is low-dimensional (many fewer core processes than phenotypes and genetic loci), locus-sparse (each locus affects few core processes), and/or phenotype-sparse (each phenotype is influenced by few core processes). Our use of sparsity as a guide in the matrix decomposition is motivated by the results of a novel empirical test indicating evidence of sparse structure in several recent genotype–phenotype datasets. First, we use synthetic data to show that our SSD approach can accurately recover core processes if each genetic locus affects few core processes or if each phenotype is affected by few core processes. Next, we apply the method to three datasets spanning adaptive mutations in yeast, a genotoxin robustness assay in human cell lines, and genetic loci identified from a yeast cross, and evaluate the biological plausibility of the core processes identified. More generally, we propose sparsity as a guiding prior for resolving latent structure in empirical genotype–phenotype maps.
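The penalized matrix decomposition described in the abstract above can be illustrated with a small sketch. This is not the authors' SSD implementation; it is a generic L1-penalized factorization fitted by alternating proximal gradient steps, and every name in it (`F`, `W`, `M`, `lam_w`, `lam_m`, `sparse_factorize`) is hypothetical. It assumes a matrix `F` of locus effects on phenotypes (loci × phenotypes) approximated as `W @ M`, where `W` maps loci to `k` core processes and `M` maps core processes to phenotypes.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_factorize(F, k, lam_w=0.05, lam_m=0.05, n_iter=300, seed=0):
    """Approximate F (loci x phenotypes) as W @ M with L1 penalties.

    Minimizes 0.5 * ||F - W M||_F^2 + lam_w * ||W||_1 + lam_m * ||M||_1
    by alternating proximal gradient steps. A larger lam_w favors
    locus-sparse solutions; a larger lam_m favors phenotype-sparse ones.
    Illustrative assumption only; not the method from the paper.
    """
    rng = np.random.default_rng(seed)
    n_loci, n_pheno = F.shape
    W = rng.normal(scale=0.1, size=(n_loci, k))   # loci -> core processes
    M = rng.normal(scale=0.1, size=(k, n_pheno))  # core processes -> phenotypes
    for _ in range(n_iter):
        # Step size 1/L, where L is the spectral norm of M M^T
        # (the Lipschitz constant of the gradient with respect to W).
        step_w = 1.0 / (np.linalg.norm(M @ M.T, 2) + 1e-12)
        W = soft_threshold(W - step_w * (W @ M - F) @ M.T, step_w * lam_w)
        # Same construction for M, with W held fixed.
        step_m = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        M = soft_threshold(M - step_m * W.T @ (W @ M - F), step_m * lam_m)
    return W, M

# Usage on synthetic data: a locus-sparse ground truth with 3 core processes.
rng = np.random.default_rng(1)
W_true = rng.normal(size=(20, 3)) * (rng.random((20, 3)) < 0.3)
M_true = rng.normal(size=(3, 8))
F_demo = W_true @ M_true
W_hat, M_hat = sparse_factorize(F_demo, k=3)
```

Both penalties are kept nonzero here because, with only one active, the overall scale of the two factors is not identified without an additional norm constraint on the unpenalized factor.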
Deep Video Analysis for Bacteria Genotype Prediction
Abstract Genetic modification of microbes is central to many biotechnology fields, such as industrial microbiology, bioproduction, and drug discovery. Understanding how specific genetic modifications influence observable bacterial behaviors is crucial for advancing these fields. In this study, we propose a supervised model to classify bacteria harboring single gene modifications to draw connections between phenotype and genotype. In particular, we demonstrate that the spatiotemporal patterns of Vibrio cholerae growth, recorded in terms of low-resolution bright-field microscopy videos, are highly predictive of the genotype class. Additionally, we introduce a weakly supervised approach to identify key moments in culture growth that significantly contribute to prediction accuracy. By focusing on the temporal expressions of bacterial behavior, our findings offer valuable insights into the underlying mechanisms and developmental stages by which specific genes control observable phenotypes. This research opens new avenues for automating the analysis of phenotypes, with potential applications in areas such as drug discovery and disease management. Furthermore, this work highlights the potential of using machine learning techniques to explore the functional roles of specific genes using a low-resolution light microscope.
- PAR ID: 10612160
- Publisher / Repository: bioRxiv
- Date Published:
- Format(s): Medium: X
- Institution: bioRxiv
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Background: Genetic and epigenetic perturbation of cis-regulatory sequences can shift patterns of gene expression and result in novel phenotypes. Phased genome assemblies now enable the local dissection of linkages between cis-regulatory sequences, including their epigenetic state, and allele-specific gene expression to further characterize gene regulation and resulting phenotypes in heterozygous genomes. Results: We assembled a locally phased genome for a mandarin hybrid named ‘Fairchild’ to explore the molecular signatures of allele-specific gene expression. With local genome phasing, genes with allele-specific expression were paired with haplotype-specific chromatin states, including levels of chromatin accessibility, histone modifications, and DNA methylation. We found that 30% of variation in allele-specific expression could be attributed to haplotype-associated factors, with allelic levels of chromatin accessibility and three histone modifications in gene bodies having the most influence. Structural variants in promoter regions were also associated with allele-specific expression, including specific enrichments of hAT and MULE-MuDR DNA transposon sequences. Integration of haplotype-resolved genetic and epigenetic landscapes with high-throughput phenotypic analysis of fruit traits in a panel of 154 accessions with mandarin and pummelo ancestry revealed that trait-associated variants were enriched in regions of open chromatin. Mining of trait-associated variants uncovered a Gypsy retrotransposon insertion in a gene that regulates potassium transport and may contribute to the reduction in fruit size that is observed in mandarins. Conclusions: Using a locally phased assembly of a heterozygous cultivar of citrus, we dissected the interplay between genetic variants and molecular phenotypes to reveal cis-regulatory sequences with potential functional effects on phenotypes relevant for genetic improvement.
-
Whiteson, Katrine (Ed.) ABSTRACT The opportunistic human pathogen Pseudomonas aeruginosa is naturally infected by a large class of temperate, transposable, Mu-like phages. We examined the genotypic and phenotypic diversity of P. aeruginosa PA14 lysogen populations as they resolve clustered regularly interspaced short palindromic repeat (CRISPR) autoimmunity, mediated by an imperfect CRISPR match to the Mu-like DMS3 prophage. After 12 days of evolution, we measured a decrease in spontaneous induction in both exponential and stationary phase growth. Co-existing variation in spontaneous induction rates in the exponential phase depended on the way the coexisting strains resolved genetic conflict. Multiple mutational modes to resolve genetic conflict between host and phage resulted in coexistence in evolved populations of single lysogens that maintained CRISPR immunity to other phages and polylysogens that lost immunity completely. This work highlights a new dimension of the role of lysogenic phages in the evolution of their hosts. IMPORTANCE The chronic opportunistic multi-drug-resistant pathogen Pseudomonas aeruginosa is persistently infected by temperate phages. We assess the contribution of temperate phage infection to the evolution of the clinically relevant strain UCBPP-PA14. We found that a low level of clustered regularly interspaced short palindromic repeat (CRISPR)-mediated self-targeting resulted in polylysogeny evolution and large genome rearrangements in lysogens; we also found extensive diversification in CRISPR spacers and cas genes. These genomic modifications resulted in decreased spontaneous induction in both exponential and stationary phase growth, increasing lysogen fitness. This work shows the importance of considering latent phage infection in characterizing the evolution of bacterial populations.
-
Abstract Migration is driven by a combination of environmental and genetic factors, but many questions remain about those drivers. Potential interactions between genetic and environmental variants associated with different migratory phenotypes are rarely the focus of study. We pair low coverage whole genome resequencing with a de novo genome assembly to examine population structure, inbreeding, and the environmental factors associated with genetic differentiation between migratory and resident breeding phenotypes in a species of conservation concern, the western burrowing owl (Athene cunicularia hypugaea). Our analyses reveal a dichotomy in gene flow depending on whether the population is resident or migratory, with the former being genetically structured and the latter exhibiting no signs of structure. Among resident populations, we observed significantly higher genetic differentiation, significant isolation‐by‐distance, and significantly elevated inbreeding. Among migratory breeding groups, on the other hand, we observed lower genetic differentiation, no isolation‐by‐distance, and substantially lower inbreeding. Using genotype–environment association analysis, we find significant evidence for relationships between migratory phenotypes (i.e., migrant versus resident) and environmental variation associated with cold temperatures during the winter and barren, open habitats. In the regions of the genome most differentiated between migrants and residents, we find significant enrichment for genes associated with the metabolism of fats. This may be linked to the increased pressure on migrants to process and store fats more efficiently in preparation for and during migration. Our results provide a significant contribution toward understanding the evolution of migratory behavior and vital insight into ongoing conservation and management efforts for the western burrowing owl.
-
Summary Ketocarotenoids, including astaxanthin, are red lipophilic pigments derived from the oxygenation of β‐carotene ionone rings. These carotenoids have exceptional antioxidant capacity and high commercial value as natural pigments, especially for aquaculture feedstocks to confer red flesh colour to salmon and shrimp. Ketocarotenoid biosynthetic pathways occur only in selected bacterial, algal, fungal and plant species, which provide genetic resources for biotechnological ketocarotenoid production. Toward pathway optimization, we developed a transient platform for ketocarotenoid production using Agrobacterium infiltration of Nicotiana benthamiana leaves with plant (Adonis aestivalis) genes, carotenoid β‐ring 4‐dehydrogenase 2 (CBFD2) and carotenoid 4‐hydroxy‐β‐ring 4‐dehydrogenase (HBFD1), or bacterial (Brevundimonas) genes, β‐carotene ketolase (crtW) and β‐carotene hydroxylase (crtZ). In this test system, heterologous expression of the plant‐derived astaxanthin pathway conferred higher astaxanthin production with fewer ketocarotenoid intermediates than the bacterial pathway. We evaluated the plant‐derived pathway for ketocarotenoid production using the oilseed camelina (Camelina sativa) as a production platform. Genes for CBFD2 and HBFD1 and maize phytoene synthase were introduced under the control of seed‐specific promoters. In contrast to prior research with bacterial pathways, our strategy resulted in nearly complete conversion of β‐carotene to ketocarotenoids, including primarily astaxanthin. Tentative identities of other ketocarotenoids were established by chemical evaluation. Seeds from multi‐season US and UK field sites maximally accumulated ~135 μg/g seed weight of ketocarotenoids, including astaxanthin (~47 μg/g seed weight). Although plants had no observable growth reduction, seed size and oil content were reduced in astaxanthin‐producing lines. Oil extracted from ketocarotenoid‐accumulating seeds showed significantly enhanced oxidative stability and was useful for food oleogel applications.

