Title: Orphan Genes Shared by Pathogenic Genomes Are More Associated with Bacterial Pathogenicity
ABSTRACT Orphan genes (also known as ORFans [i.e., orphan open reading frames]) are new genes that enable an organism to adapt to its specific living environment. Our focus in this study is to compare ORFans between pathogens (P) and nonpathogens (NP) of the same genus. Using the pangenome idea, we have identified 130,169 ORFans in nine bacterial genera (505 genomes) and classified these ORFans into four groups: (i) SS-ORFans (P), which are only found in a single pathogenic genome; (ii) SS-ORFans (NP), which are only found in a single nonpathogenic genome; (iii) PS-ORFans (P), which are found in multiple pathogenic genomes; and (iv) NS-ORFans (NP), which are found in multiple nonpathogenic genomes. Within the same genus, pathogens do not always have more genes, more ORFans, or more pathogenicity-related genes (PRGs)—including prophages, pathogenicity islands (PAIs), virulence factors (VFs), and horizontal gene transfers (HGTs)—than nonpathogens. Interestingly, in pathogens of the nine genera, the percentages of PS-ORFans are consistently higher than those of SS-ORFans, which is not true in nonpathogens. Similarly, in pathogens of the nine genera, the percentages of PS-ORFans matching the four types of PRGs are also always higher than those of SS-ORFans, but this is not true in nonpathogens. All of these findings suggest the greater importance of PS-ORFans for bacterial pathogenicity. IMPORTANCE Recent pangenome analyses of numerous bacterial species have suggested that each genome of a single species may have a significant fraction of its gene content unique or shared by a very few genomes (i.e., ORFans). We selected nine bacterial genera, each containing at least five pathogenic and five nonpathogenic genomes, to compare their ORFans in relation to pathogenicity-related genes. Pathogens in these genera are known to cause a number of common and devastating human diseases such as pneumonia, diphtheria, melioidosis, and tuberculosis. 
Thus, they are worthy of in-depth systems microbiology investigations, including the comparative study of ORFans between pathogens and nonpathogens. We provide direct evidence to suggest that ORFans shared by more pathogens are more associated with pathogenicity-related genes and thus are more important targets for development of new diagnostic markers or therapeutic drugs for bacterial infectious diseases.
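The four-way grouping described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy genome labels, and the input layout (a mapping from each ORFan to the genomes where it was found) are assumptions made for illustration. It also assumes that an ORFan found in both pathogenic and nonpathogenic genomes belongs to none of the four groups, which the abstract does not state explicitly.

```python
from collections import Counter


def classify_orfan(genomes, is_pathogen):
    """Return the group label for one ORFan, or None if it fits no group.

    genomes     -- iterable of genome IDs in which the ORFan was found
    is_pathogen -- dict mapping genome ID -> True (P) or False (NP)
    """
    genomes = set(genomes)
    if not genomes:
        return None
    statuses = {is_pathogen[g] for g in genomes}
    if len(statuses) > 1:
        # Assumption: ORFans shared between P and NP genomes are left unlabeled.
        return None
    pathogenic = statuses.pop()
    if len(genomes) == 1:
        # Found in a single genome
        return "SS-ORFans (P)" if pathogenic else "SS-ORFans (NP)"
    # Found in multiple genomes of the same pathogenicity status
    return "PS-ORFans (P)" if pathogenic else "NS-ORFans (NP)"


def count_groups(orfan_to_genomes, is_pathogen):
    """Count how many ORFans fall into each of the four groups."""
    labels = (classify_orfan(g, is_pathogen) for g in orfan_to_genomes.values())
    return Counter(label for label in labels if label is not None)


# Hypothetical toy data for one genus: two pathogens, two nonpathogens.
is_pathogen = {"P1": True, "P2": True, "N1": False, "N2": False}
orfans = {
    "orf_a": ["P1"],
    "orf_b": ["P1", "P2"],
    "orf_c": ["N1"],
    "orf_d": ["N1", "N2"],
    "orf_e": ["P1", "N1"],  # mixed, so unlabeled under the assumption above
}
groups = count_groups(orfans, is_pathogen)
```

Per-genus comparisons such as those reported in the abstract would then compare the PS-ORFan and SS-ORFan counts (or percentages, given a chosen denominator) among pathogens, and the NS-ORFan and SS-ORFan counts among nonpathogens.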
Award ID(s): 1652164
PAR ID: 10085828
Journal Name: mSystems
Volume: 4
Issue: 1
ISSN: 2379-5077
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Li-Jun, Ma (Ed.)
    Abstract Fungi of the genus Botrytis infect >1,400 plant species and cause losses in many crops. Besides the broad host range pathogen Botrytis cinerea, most other species are restricted to a single host. Long-read technology was used to sequence genomes of eight Botrytis species, mostly pathogenic on Allium species, and the related onion white rot fungus, Sclerotium cepivorum. Most assemblies contained <100 contigs, with the Botrytis aclada genome assembled in 16 gapless chromosomes. The core genome and pan-genome of 16 Botrytis species were defined and the secretome, effector, and secondary metabolite repertoires analyzed. Among those genes, none is shared among all Allium pathogens and absent from non-Allium pathogens. The genome of each of the Allium pathogens contains 8–39 predicted effector genes that are unique for that single species, but none stood out as a potential determinant for host specificity. Chromosome configurations of common ancestors of the genus Botrytis and family Sclerotiniaceae were reconstructed. The genomes of B. cinerea and B. aclada were highly syntenic with only 19 rearrangements between them. Genomes of Allium pathogens were compared with ten other Botrytis species (nonpathogenic on Allium) and with 25 Leotiomycetes for their repertoire of secondary metabolite gene clusters. The pattern was complex, with several clusters displaying patchy distribution. Two clusters involved in the synthesis of phytotoxic metabolites are at distinct genomic locations in different Botrytis species. We provide evidence that the clusters for botcinic acid production in B. cinerea and Botrytis sinoallii were acquired by horizontal transfer from taxa within the same genus.
  2. ‘Candidatus Liberibacter’ is a group of bacterial species that are obligate intracellular plant pathogens and cause Huanglongbing disease of citrus trees and Zebra Chip in potatoes. Here, we examined the extent of intra- and interspecific genetic diversity across the genus using comparative genomics. Our approach examined a wide set of Liberibacter genome sequences including five pathogenic species and one species not known to cause disease. By performing comparative genomics analyses, we sought to understand the evolutionary history of this genus and to identify genes or genome regions that may affect pathogenicity. With a set of 52 genomes, we performed comparative genomics, measured genome rearrangement, and completed statistical tests of positive selection. We explored markers of genetic diversity across the genus, such as average nucleotide identity across the whole genome. These analyses revealed the highest intraspecific diversity amongst the ‘Ca. Liberibacter solanacearum’ species, which also has the largest plant host range. We identified sets of core and accessory genes across the genus and within each species and measured the ratio of nonsynonymous to synonymous mutations (dN/dS) across genes. We identified ten genes with evidence of a history of positive selection in the Liberibacter genus, including genes in the Tad complex, which have been previously implicated as being highly divergent in the ‘Ca. L. capsica’ species based on high values of dN.
  3. Burbank, Lindsey Price (Ed.)
    ABSTRACT Liberibacter pathogens are the causative agents of several severe crop diseases worldwide, including citrus Huanglongbing and potato zebra chip. These bacteria are endophytic and nonculturable, which makes experimental approaches challenging and highlights the need for bioinformatic analysis in advancing our understanding about Liberibacter pathogenesis. Here, we performed an in-depth comparative phylogenomic analysis of the Liberibacter pathogens and their free-living, nonpathogenic, ancestral species, aiming to identify major genomic changes and determinants associated with their evolutionary transitions in living habitats and pathogenicity. Using gene neighborhood analysis and phylogenetic classification, we systematically uncovered, annotated, and classified all prophage loci into four types, including one previously unrecognized group. We showed that these prophages originated through independent gene transfers at different evolutionary stages of Liberibacter and only the SC-type prophage was associated with the emergence of the pathogens. Using ortholog clustering, we vigorously identified two additional sets of genomic genes, which were either lost or gained in the ancestor of the pathogens. Consistent with the habitat change, the lost genes were enriched for biosynthesis of cellular building blocks. Importantly, among the gained genes, we uncovered several previously unrecognized toxins, including new toxins homologous to the EspG/VirA effectors, a YdjM phospholipase toxin, and a secreted endonuclease/exonuclease/phosphatase (EEP) protein. Our results substantially extend the knowledge of the evolutionary events and potential determinants leading to the emergence of endophytic, pathogenic Liberibacter species, which will facilitate the design of functional experiments and the development of new methods for detection and blockage of these pathogens. 
IMPORTANCE Liberibacter pathogens are associated with several severe crop diseases, including citrus Huanglongbing, the most destructive disease to the citrus industry. Currently, no effective cure or treatments are available, and no resistant citrus variety has been found. The fact that these obligate endophytic pathogens are not culturable has made it extremely challenging to experimentally uncover the genes/proteins important to Liberibacter pathogenesis. Further, earlier bioinformatics studies failed to identify key genomic determinants, such as toxins and effector proteins, that underlie the pathogenicity of the bacteria. In this study, an in-depth comparative genomic analysis of Liberibacter pathogens along with their ancestral nonpathogenic species identified the prophage loci and several novel toxins that are evolutionarily associated with the emergence of the pathogens. These results shed new light on the disease mechanism of Liberibacter pathogens and will facilitate the development of new detection and blockage methods targeting the toxins. 
  4. Alexandre, Gladys (Ed.)
    ABSTRACT Xylella fastidiosa infects several economically important crops in the Americas, and it also recently emerged in Europe. Here, using a set of Xylella genomes reflective of the genus-wide diversity, we performed a pan-genome analysis based on both core and accessory genes for two purposes: (i) to test associations between genetic divergence and plant host species and (ii) to identify positively selected genes that are potentially involved in arms-race dynamics. For the former, tests yielded significant evidence for the specialization of X. fastidiosa to plant host species. This observation contributes to a growing literature suggesting that the phylogenetic history of X. fastidiosa lineages affects the host range. For the latter, our analyses uncovered evidence of positive selection across codons for 5.3% (67 of 1,257) of the core genes and 5.4% (201 of 3,691) of the accessory genes. These genes are candidates to encode interacting factors with plant and insect hosts. Most of these genes had unknown functions, but we did identify some tractable candidates, including nagZ_2, which encodes a beta-glucosidase that is important for Neisseria gonorrhoeae biofilm formation; cya, which modulates gene expression in pathogenic bacteria; and barA, a membrane-associated histidine kinase that has roles in cell division, metabolism, and pili formation.
IMPORTANCE Xylella fastidiosa causes devastating diseases to several critical crops. Because X. fastidiosa colonizes and infects many plant species, it is important to understand whether the genome of X. fastidiosa has genetic determinants that underlie specialization to specific host plants. We analyzed genome sequences of X. fastidiosa to investigate evolutionary relationships and to test for evidence of positive selection on specific genes. We found a significant signal between genome diversity and host plants, consistent with bacterial specialization to specific plant hosts. By screening for positive selection, we identified both core and accessory genes that may affect pathogenicity, including genes involved in biofilm formation.
  5. Hayer, Juliette (Ed.)
    Staphylococcus aureus causes both hospital- and community-acquired infections in humans worldwide. Due to the high incidence of infection, S. aureus is also one of the most sampled and sequenced pathogens today, providing an outstanding resource to understand variation at the bacterial subspecies level. We processed and downsampled 83,383 public S. aureus Illumina whole-genome shotgun sequences and 1,263 complete genomes to produce 7,954 representative substrains. Pairwise comparison of average nucleotide identity revealed a natural boundary of 99.5% that could be used to define 145 distinct strains within the species. We found that intermediate frequency genes in the pangenome (present in 10%–95% of genomes) could be divided into those closely linked to strain background (“strain-concentrated”) and those highly variable within strains (“strain-diffuse”). Non-core genes had different patterns of chromosome location. Notably, strain-diffuse genes were associated with prophages; strain-concentrated genes were associated with the vSaβ genome island and rare genes (<10% frequency) concentrated near the origin of replication. Antibiotic resistance genes were enriched in the strain-diffuse class, while virulence genes were distributed between strain-diffuse, strain-concentrated, core, and rare classes. This study shows how different patterns of gene movement help create strains as distinct subspecies entities and provide insight into the diverse histories of important S. aureus functions. 