

Title: Signal Peptides Generated by Attention-Based Neural Networks
Signal peptides (SPs), short (15–30 residue) chains of amino acids at the amino termini of expressed proteins, specify secretion in living cells. We trained an attention-based neural network, the Transformer model, on data from all available organisms in Swiss-Prot to generate SP sequences. Experimental testing demonstrates that the model-generated SPs are functional: when appended to enzymes expressed in an industrial Bacillus subtilis strain, the SPs lead to secreted activity that is competitive with industrially used SPs. Additionally, the model-generated SPs are diverse in sequence, sharing as little as 58% sequence identity with the closest known native signal peptide and 73% ± 9% on average.
Award ID(s):
1937902
NSF-PAR ID:
10179180
Author(s) / Creator(s):
; ; ; ; ; ; ;
Date Published:
Journal Name:
ACS Synthetic Biology
ISSN:
2161-5063
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Agrobacterium effector protein VirE2 is important for plant transformation. VirE2 likely coats transferred DNA (T-DNA) in the plant cell and protects it from degradation. VirE2 localizes to the plant cytoplasm and interacts with several host proteins. Plant-expressed VirE2 can complement a virE2 mutant Agrobacterium strain to support transformation. We investigated whether VirE2 could facilitate transformation from a nuclear location by affixing to it a strong nuclear localization signal (NLS) sequence. Only cytoplasmic-, but not nuclear-localized, VirE2 could stimulate transformation. To investigate the ways VirE2 supports transformation, we generated transgenic Arabidopsis plants containing a virE2 gene under the control of an inducible promoter and performed RNA-seq and proteomic analyses before and after induction. Some differentially expressed plant genes were previously known to facilitate transformation. Knockout mutant lines of some other VirE2 differentially expressed genes showed altered transformation phenotypes. Levels of some proteins known to be important for transformation increased in response to VirE2 induction, but prior to or without induction of their corresponding mRNAs. Overexpression of some other genes whose proteins increased after VirE2 induction resulted in increased transformation susceptibility. We conclude that cytoplasmically localized VirE2 modulates both plant RNA and protein levels to facilitate transformation. 
  2. ABSTRACT Plant pathogens utilize a portfolio of secreted effectors to successfully infect and manipulate their hosts. It is, however, still unclear whether changes in secretomes leading to host specialization involve mostly effector gene gains/losses or changes in their sequences. To test these hypotheses, we compared the secretomes of three host-specific castrating anther smut fungi (Microbotryum), two being sister species. To address within-species evolution, which might involve coevolution and local adaptation, we compared the secretomes of strains from differentiated populations. We experimentally validated a subset of signal peptides. Secretomes ranged from 321 to 445 predicted secreted proteins (SPs), including a few species-specific proteins (42 to 75), and limited copy number variation, i.e., little gene family expansion or reduction. Between 52% and 68% of the SPs did not match any Pfam domain, a percentage that reached 80% for the small secreted proteins, indicating rapid evolution. In comparison to background genes, we indeed found SPs to be more differentiated among species and strains, more often under positive selection, and highly expressed in planta; repeat-induced point mutations (RIPs) had no role in effector diversification, as SPs were not closer to transposable elements than background genes and were not more RIP affected. Our study thus identified both conserved core proteins, likely required for the pathogenic life cycle of all Microbotryum species, and proteins that were species specific or evolving under positive selection; these proteins may be involved in host specialization and/or coevolution. Most changes among closely related host-specific pathogens, however, involved rapid changes in sequences rather than gene gains/losses. IMPORTANCE Plant pathogens use molecular weapons to successfully infect their hosts, secreting a large portfolio of various proteins and enzymes. Different plant species are often parasitized by host-specific pathogens; however, it is still unclear whether the molecular basis of such host specialization involves species-specific weapons or different variants of the same weapons. We therefore compared the genes encoding secreted proteins in three plant-castrating pathogens parasitizing different host plants, producing their spores in plant anthers by replacing pollen. We validated our predictions for secretion signals for some genes and checked that our predicted secreted proteins were often highly expressed during plant infection. While we found few species-specific secreted proteins, numerous genes encoding secreted proteins showed signs of rapid evolution and of natural selection. Our study thus found that most changes among closely related host-specific pathogens involved rapid adaptive changes in shared molecular weapons rather than innovations for new weapons.
  3. Abstract

    Although many insects are associated with obligate bacterial endosymbionts, the mechanisms by which these host/endosymbiont associations are regulated remain mysterious. While microRNAs (miRNAs) have been recently identified as regulators of host/microbe interactions, including host/pathogen and host/facultative endosymbiont interactions, the role miRNAs may play in mediating host/obligate endosymbiont interactions is virtually unknown. Here, we identified conserved miRNAs that potentially mediate symbiotic interactions between aphids and their obligate endosymbiont, Buchnera aphidicola. Using small RNA sequence data from Myzus persicae and Acyrthosiphon pisum, we annotated 93 M. persicae and 89 A. pisum miRNAs, among which 69 were shared. We found 14 miRNAs that were either highly expressed in aphid bacteriome, the Buchnera‐housing tissue, or differentially expressed in bacteriome vs. gut, a non‐Buchnera‐housing tissue. Strikingly, 10 of these 14 miRNAs have been implicated previously in other host/microbe interaction studies. Investigating the interaction networks of these miRNAs using a custom computational pipeline, we identified 103 miRNA::mRNA interactions shared between M. persicae and A. pisum. Functional annotation of the shared mRNA targets revealed only two over‐represented cluster of orthologous group categories: amino acid transport and metabolism, and signal transduction mechanisms. Our work supports a role for miRNAs in mediating host/symbiont interactions between aphids and their obligate endosymbiont Buchnera. In addition, our results highlight the probable importance of signal transduction mechanisms to host/endosymbiont coevolution.

     
  4. Recent advances in protein structure prediction have generated accurate structures of previously uncharacterized human proteins. Identifying domains in these predicted structures and classifying them into an evolutionary hierarchy can reveal biological insights. Here, we describe the detection and classification of domains from the human proteome. Our classification indicates that only 62% of residues are located in globular domains. We further classify these globular domains and observe that the majority (65%) can be classified among known folds by sequence, with a smaller fraction (33%) requiring structural data to refine the domain boundaries and/or to support their homology. A relatively small number (966 domains) cannot be confidently assigned using our automatic pipelines, thus demanding manual inspection. We classify 47,576 domains, of which only 23% have been included in experimental structures. A portion (6.3%) of these classified globular domains lack sequence-based annotation in InterPro. A quarter (23%) have not been structurally modeled by homology, and they contain 2,540 known disease-causing single amino acid variations whose pathogenesis can now be inferred using AF models. A comparison of classified domains from a series of model organisms revealed expansions of several immune response-related domains in humans and a depletion of olfactory receptors. Finally, we use this classification to expand well-known protein families of biological significance. These classifications are presented on the ECOD website (http://prodata.swmed.edu/ecod/index_human.php).
  5. Abstract

    Maternal inheritance of mitochondria creates a sex‐specific selective sieve through which mitochondrial mutations harmful to males but not females accumulate and contribute to sexual differences in longevity and disease susceptibility. Because eggs and sperm are under disruptive selection, sperm are predicted to be particularly vulnerable to the genetic load generated by maternal inheritance, yet evidence for mitochondrial involvement in male fertility is limited and controversial. Here, we exploit the coexistence of two divergent mitochondrial haplogroups (A and B2) in a Neotropical arachnid to investigate the role of mitochondria in sperm competition. DNA profiling demonstrated that B2‐carrying males sired more than three times as many offspring in sperm competition experiments as A males, and this B2 competitive advantage cannot be explained by female mitochondrial haplogroup or male nuclear genetic background. RNA‐Seq of testicular tissues implicates differential expression of mitochondrial oxidative phosphorylation (OXPHOS) genes in the B2 competitive advantage, including a 22‐fold upregulation of atp8 in B2 males. Previous comparative genomic analyses have revealed functionally significant amino acid substitutions in differentially expressed genes, indicating that the mitochondrial haplogroups differ not only in expression but also in DNA sequence and protein functioning. However, mitochondrial haplogroup had no effect on sperm number or sperm viability, and, when females were mated to a single male, neither male haplogroup, female haplogroup nor the interaction between male/female haplogroup significantly affected female reproductive success. Our findings therefore suggest that mitochondrial effects on male reproduction may often go undetected in noncompetitive contexts and may prove more important in nature than is currently appreciated.

     