Title: Architectural groups of a subtelomeric gene family evolve along distinct paths in Candida albicans
Abstract Subtelomeres are dynamic genomic regions shaped by elevated rates of recombination, mutation, and gene birth/death. These processes contribute to formation of lineage-specific gene family expansions that commonly occupy subtelomeres across eukaryotes. Investigating the evolution of subtelomeric gene families is complicated by the presence of repetitive DNA and high sequence similarity among gene family members that prevents accurate assembly from whole genome sequences. Here, we investigated the evolution of the telomere-associated (TLO) gene family in Candida albicans using 189 complete coding sequences retrieved from 23 genetically diverse strains across the species. Tlo genes conformed to the 3 major architectural groups (α/β/γ) previously defined in the genome reference strain but significantly differed in the degree of within-group diversity. One group, Tloβ, was always found at the same chromosome arm with strong sequence similarity among all strains. In contrast, diverse Tloα sequences have proliferated among chromosome arms. Tloγ genes formed 7 primary clades that included each of the previously identified Tloγ genes from the genome reference strain with 3 Tloγ genes always found on the same chromosome arm among strains. Architectural groups displayed regions of high conservation that resolved newly identified functional motifs, providing insight into potential regulatory mechanisms that distinguish groups. Thus, by resolving intraspecies subtelomeric gene variation, it is possible to identify previously unknown gene family complexity that may underpin adaptive functional variation.
Award ID(s):
2046863 1638999
PAR ID:
10397789
Editor(s):
Rokas, A
Journal Name:
G3 Genes|Genomes|Genetics
Volume:
12
Issue:
12
ISSN:
2160-1836
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Kim, J (Ed.)
    Abstract Though natural systems harbor genetic and phenotypic variation, research in model organisms is often restricted to a reference strain. Focusing on a reference strain yields a great depth of knowledge but potentially at the cost of breadth of understanding. Furthermore, tools developed in the reference context may introduce bias when applied to other strains, posing challenges to defining the scope of variation within model systems. Here, we evaluate how genetic differences among 5 wild Caenorhabditis elegans strains affect gene expression and its quantification, in general and after induction of the RNA interference (RNAi) response. Across strains, 34% of genes were differentially expressed in the control condition, including 411 genes that were not expressed at all in at least 1 strain; 49 of these were unexpressed in reference strain N2. Reference genome mapping bias caused limited concern: despite hyperdiverse hotspots throughout the genome, 92% of variably expressed genes were robust to mapping issues. The transcriptional response to RNAi was highly strain- and target-gene-specific and did not correlate with RNAi efficiency, as the 2 RNAi-insensitive strains showed more differentially expressed genes following RNAi treatment than the RNAi-sensitive reference strain. We conclude that gene expression, generally and in response to RNAi, differs across C. elegans strains such that the choice of strain may meaningfully influence scientific inferences. Finally, we introduce a resource for querying gene expression variation in this dataset at https://wildworm.biosci.gatech.edu/rnai/. 
  2. Koomey, Michael (Ed.)
    ABSTRACT Elizabethkingia anophelis is an emerging global multidrug-resistant opportunistic pathogen. We assessed the diversity among 13 complete genomes and 23 draft genomes of E. anophelis strains derived from various environmental settings and human infections from different geographic regions around the world from 1950s to the present. Putative integrative and conjugative elements (ICEs) were identified in 31/36 (86.1%) strains in the study. A total of 52 putative ICEs (including eight degenerated elements lacking integrases) were identified and categorized into three types based on the architecture of the conjugation module and the phylogeny of the relaxase, coupling protein, TraG, and TraJ protein sequences. The type II and III ICEs were found to integrate adjacent to tRNA genes, while type I ICEs integrate into intergenic regions or into a gene. The ICEs carry various cargo genes, including transcription regulator genes and genes conferring antibiotic resistance. The adaptive immune CRISPR-Cas system was found in nine strains, including five strains in which CRISPR-Cas machinery and ICEs coexist at different locations on the same chromosome. One ICE-derived spacer was present in the CRISPR locus in one strain. ICE distribution in the strains showed no geographic or temporal patterns. The ICEs in E. anophelis differ in architecture and sequence from CTnDOT, a well-studied ICE prevalent in Bacteroides spp. The categorization of ICEs will facilitate further investigations of the impact of ICE on virulence, genome epidemiology, and adaptive genomics of E. anophelis. IMPORTANCE Elizabethkingia anophelis is an opportunistic human pathogen, and the genetic diversity between strains from around the world becomes apparent as more genomes are sequenced. Genome comparison identified three types of putative ICEs in 31 of 36 strains. The diversity of ICEs suggests that they had different origins. One of the ICEs was discovered previously from a large E. anophelis outbreak in Wisconsin in the United States; this ICE has integrated into the mutY gene of the outbreak strain, creating a mutator phenotype. Similar to ICEs found in many bacterial species, ICEs in E. anophelis carry various cargo genes that enable recipients to resist antibiotics and adapt to various ecological niches. The adaptive immune CRISPR-Cas system is present in nine of 36 strains. An ICE-derived spacer was found in the CRISPR locus in a strain that has no ICE, suggesting a past encounter and effective defense against ICE.
  3. Wittkopp, Patricia (Ed.)
    Abstract Recent pangenome studies have revealed a large fraction of the gene content within a species exhibits presence-absence variation (PAV). However, coding regions alone provide an incomplete assessment of functional genomic sequence variation at the species level. Little to no attention has been paid to noncoding regulatory regions in pangenome studies, though these sequences directly modulate gene expression and phenotype. To uncover regulatory genetic variation, we generated chromosome-scale genome assemblies for thirty Arabidopsis thaliana accessions from multiple distinct habitats and characterized species level variation in Conserved Noncoding Sequences (CNS). Our analyses uncovered not only PAV and positional variation (PosV) but that diversity in CNS is non-random, with variants shared across different accessions. Using evolutionary analyses and chromatin accessibility data, we provide further evidence supporting roles for conserved and variable CNS in gene regulation. Additionally, our data suggests transposable elements contribute to CNS variation. Characterizing species-level diversity in all functional genomic sequences may later uncover previously unknown mechanistic links between genotype and phenotype. 
  4. Genetically modified organisms are commonly used in disease research and agriculture but the precise genomic alterations underlying transgenic mutations are often unknown. The position and characteristics of transgenes, including the number of independent insertions, influences the expression of both transgenic and wild-type sequences. We used long-read, Oxford Nanopore Technologies (ONT) to sequence and assemble two transgenic strains of Caenorhabditis elegans commonly used in the research of neurodegenerative diseases: BY250 (pPdat-1::GFP) and UA44 (GFP and human α-synuclein), a model for Parkinson’s research. After scaffolding to the reference, the final assembled sequences were ∼102 Mb with N50s of 17.9 Mb and 18.0 Mb, respectively, and L90s of six contiguous sequences, representing chromosome-level assemblies. Each of the assembled sequences contained more than 99.2% of the Nematoda BUSCO genes found in the C. elegans reference and 99.5% of the annotated C. elegans reference protein-coding genes. We identified the locations of the transgene insertions and confirmed that all transgene sequences were inserted in intergenic regions, leaving the organismal gene content intact. The transgenic C. elegans genomes presented here will be a valuable resource for Parkinson’s research as well as other neurodegenerative diseases. Our work demonstrates that long-read sequencing is a fast, cost-effective way to assemble genome sequences and characterize mutant lines and strains.
  5.
    Plant subtilases (SBTs) or subtilisin-like proteases comprise a very diverse family of serine peptidases that participates in a broad spectrum of biological functions. Despite increasing evidence for roles of SBTs in plant immunity in recent years, little is known about wheat (Triticum aestivum) SBTs (TaSBTs). Here, we identified 255 TaSBT genes from bread wheat using the latest version 2.0 of the reference genome sequence. The SBT family can be grouped into five clades, from TaSBT1 to TaSBT5, based on a phylogenetic tree constructed with deduced protein sequences. In silico protein-domain analysis revealed the existence of considerable sequence diversification of the TaSBT family which, together with the local clustered gene distribution, suggests that TaSBT genes have undergone extensive functional diversification. Among those TaSBT genes whose expression was altered by biotic factors, TaSBT1.7 was found to be induced in wheat leaves by chitin and flg22 elicitors, as well as six examined pathogens, implying a role for TaSBT1.7 in plant defense. Transient overexpression of TaSBT1.7 in Nicotiana benthamiana leaves resulted in necrotic cell death. Moreover, knocking down TaSBT1.7 in wheat using barley stripe mosaic virus-induced gene silencing compromised the hypersensitive response and resistance against Puccinia striiformis f. sp. tritici, the causal agent of wheat stripe rust. Taken together, this study defined the full complement of wheat SBT genes and provided evidence for a positive role of one particular member, TaSBT1.7, in the incompatible interaction between wheat and a stripe rust pathogen. 