Title: Venomix: a simple bioinformatic pipeline for identifying and characterizing toxin gene candidates from transcriptomic data
The advent of next-generation sequencing has resulted in transcriptome-based approaches to investigate functionally significant biological components in a variety of non-model organisms. This has given rise to the area of “venomics”: a rapidly growing field using combined transcriptomic and proteomic datasets to characterize toxin diversity in a variety of venomous taxa. After transcriptome assembly, the transcriptomic portion of these analyses follows very similar steps, often including candidate toxin identification using BLAST, expression level screening, protein sequence alignment, gene tree reconstruction, and characterization of potential toxin function. Here we describe the Python package Venomix, which streamlines these processes using common bioinformatic tools along with ToxProt, a publicly available annotated database composed of characterized venom proteins. In this study, we use the Venomix pipeline to characterize candidate venom diversity in four phylogenetically distinct organisms: a cone snail (Conidae; Conus sponsalis), a snake (Viperidae; Echis coloratus), an ant (Formicidae; Tetramorium bicarinatum), and a scorpion (Scorpionidae; Urodacus yaschenkoi). Data on these organisms were sampled from public databases, with each original analysis using different approaches for transcriptome assembly, toxin identification, or gene expression quantification. Venomix recovered numerically more candidate toxin transcripts for three of the four transcriptomes than the original analyses and identified new toxin candidates. In summary, we show that the Venomix package is a useful tool to identify and characterize the diversity of toxin-like transcripts derived from transcriptomic datasets. Venomix is available at: https://bitbucket.org/JasonMacrander/Venomix/ .
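The first two steps named in the abstract (candidate identification from BLAST hits, then expression level screening) can be illustrated with a minimal sketch. This is not Venomix's own code: the function names, the e-value and TPM thresholds, and the example transcript and subject identifiers are all assumptions made for illustration. The only thing taken as given is the standard 12-column BLAST tabular layout (`-outfmt 6`).

```python
import csv
import io

# Standard 12 columns of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_toxin_hits(blast_tsv, max_evalue=1e-5):
    """Return the highest-bitscore hit per transcript with evalue <= max_evalue.

    Hypothetical helper; the threshold is an assumed default, not Venomix's.
    """
    best = {}
    reader = csv.DictReader(blast_tsv, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        if evalue > max_evalue:
            continue  # hit is too weak to count as a candidate
        current = best.get(row["qseqid"])
        if current is None or bitscore > current["bitscore"]:
            best[row["qseqid"]] = {
                "subject": row["sseqid"],
                "evalue": evalue,
                "bitscore": bitscore,
            }
    return best


def screen_by_expression(hits, expression, min_tpm=1.0):
    """Keep candidates whose expression estimate is at least min_tpm.

    Hypothetical helper; transcripts missing from `expression` count as 0.
    """
    return {t: h for t, h in hits.items() if expression.get(t, 0.0) >= min_tpm}


# Made-up BLAST rows: transcripts tx1..tx3 searched against a toxin database.
example_blast = io.StringIO(
    "tx1\ttoxin_A\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-20\t95.0\n"
    "tx1\ttoxin_B\t90.0\t50\t5\t0\t1\t50\t1\t50\t1e-30\t120.0\n"
    "tx2\ttoxin_C\t30.0\t40\t28\t0\t1\t40\t1\t40\t0.5\t20.0\n"
    "tx3\ttoxin_A\t60.0\t45\t18\t0\t1\t45\t1\t45\t1e-8\t60.0\n"
)
# Made-up expression estimates (TPM) per transcript.
example_expression = {"tx1": 250.0, "tx3": 0.2}

hits = best_toxin_hits(example_blast)
candidates = screen_by_expression(hits, example_expression)
print(sorted(hits))        # transcripts with a significant toxin hit
print(sorted(candidates))  # those that also pass the expression screen
```

Keeping only the best hit per transcript is one simple design choice; a real analysis might retain all significant hits so that later alignment and gene tree steps can group transcripts by toxin family.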
Award ID(s):
1401014
PAR ID:
10363782
Journal Name:
PeerJ
Volume:
6
ISSN:
2167-8359
Page Range / eLocation ID:
e5361
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The venoms of small rear-fanged snakes (RFS) remain largely unexplored, despite increased recognition of their importance in understanding venom evolution more broadly. Sequencing the transcriptome of venom-producing glands has greatly increased the ability of researchers to examine and characterize the toxin repertoire of small taxa with low venom yields. Here, we use RNA-seq to characterize the Duvernoy’s gland transcriptome of the Plains Black-headed Snake, Tantilla nigriceps, a small, semi-fossorial colubrid that feeds on a variety of potentially dangerous arthropods including centipedes and spiders. We generated transcriptomes of six individuals from three localities in order both to characterize the toxin expression of this species for the first time and to look for initial evidence of venom variation in the species. Three toxin families, namely three-finger neurotoxins (3FTxs), cysteine-rich secretory proteins (CRISPs), and snake venom metalloproteinases (SVMPIIIs), dominated the transcriptome of T. nigriceps; 3FTxs were the dominant toxin family in most individuals, accounting for as much as 86.4% of an individual’s toxin expression. Variation in toxin expression between individuals was also noted, with two specimens exhibiting higher relative expression of c-type lectins than any other sample (8.7–11.9% compared to <1%), and another expressing CRISPs more highly than any other toxin. This study provides the first Duvernoy’s gland transcriptomes of any species of Tantilla, and one of the few transcriptomic studies of RFS not predicated on a single individual. This initial characterization demonstrates the need for further study of toxin expression variation in this species, as well as the need for further exploration of small RFS venoms.
  2. Yoder, Anne (Ed.)
    Understanding the joint roles of protein sequence variation and differential expression during adaptive evolution is a fundamental, yet largely unrealized goal of evolutionary biology. Here, we use phylogenetic path analysis to analyze a comprehensive venom-gland transcriptome dataset spanning three genera of pitvipers to identify the functional genetic basis of a key adaptation (venom complexity) linked to diet breadth (DB). The analysis of gene-family-specific patterns reveals that, for genes encoding two of the most important venom proteins (snake venom metalloproteases and snake venom serine proteases), there are direct, positive relationships between sequence diversity (SD), expression diversity (ED), and increased DB. Further analysis of gene-family diversification for these proteins showed no constraint on how individual lineages achieved toxin gene SD in terms of the patterns of paralog diversification. In contrast, another major venom protein family (PLA2s) showed no relationship between venom molecular diversity and DB. Additional analyses suggest that other molecular mechanisms (such as higher absolute levels of expression) are responsible for diet adaptation involving these venom proteins. Broadly, our findings argue that functional diversity generated through sequence and expression variation jointly determines adaptation in the key components of pitviper venoms, which mediate complex molecular interactions between the snakes and their prey.
  3. Ontogenetic changes in venom composition have been described in Bothrops snakes, but only a few studies have attempted to identify the targeted paralogues or the molecular mechanisms involved in modifications of gene expression during ontogeny. In this study, we decoded B. jararacussu venom gland transcripts from six specimens of varying sizes and analyzed the variability in the composition of independent venom proteomes from 19 individuals. We identified 125 distinct putative toxin transcripts; of these, 73 were detected in venom proteomes and only 10 were involved in the ontogenetic changes. Ontogenetic variability was linearly related to snake size and did not correspond to the maturation of the reproductive stage. Changes in the transcriptome were highly predictive of changes in the venom proteome. The basic myotoxic phospholipases A2 (PLA2s) were the most abundant components in larger snakes, while in venoms from smaller snakes, PIII-class SVMPs were the major components. The snake venom metalloproteinases (SVMPs) identified corresponded to novel sequences and conferred higher pro-coagulant and hemorrhagic functions to the venom of small snakes. The mechanisms modulating venom variability are predominantly related to transcriptional events and may provide the advantage of higher hematotoxicity and more efficient predatory function in the venom of small snakes.
  4. Rapid development of transcriptome sequencing technologies has resulted in a data revolution and the emergence of new approaches to study transcriptomic regulation, such as alternative splicing, alternative polyadenylation, and CRISPR knockout screening, in addition to regular gene expression. A full characterization of the transcriptional landscape of different groups of cells or tissues holds enormous potential for both basic science and clinical applications. Although many methods have been developed in the realm of differential gene expression analysis, they are all geared towards a particular type of sequencing data and fail to perform well when applied to different types of transcriptomic data. To fill this gap, we offer a negative beta binomial t-test (NBBt-test). NBBt-test provides multiple functions to perform differential analyses of alternative splicing, polyadenylation, CRISPR knockout screening, and gene expression datasets. Both real and large-scale simulation data show that NBBt-test performs better than currently popular statistical methods, with higher efficiency and lower type I error rate and FDR, in identifying differential isoforms, differentially expressed genes, and differential CRISPR knockout screening genes across different sample sizes. An R package implementing NBBt-test is available for download from CRAN ( https://CRAN.R-project.org/package=NBBttest ).
  5. Understanding the molecular mechanisms that underlie snake venom variability provides important clues for understanding how the biological functions of this powerful toxic arsenal evolve. Here we analyzed in detail individual transcripts and venom protein isoforms produced by five specimens of a venomous snake (Bothrops atrox) from two nearby but genetically distinct populations from the Brazilian Amazon rainforest showing functional similarities in venom properties. Individual variation was observed among the venoms of these specimens, but the overall abundance of each general toxin family was conserved both in transcripts and in venom protein levels. However, when expression of independent paralogues was analyzed, remarkable differences were observed within and among each toxin group both between individuals and between populations. Transcripts for functionally essential venom proteins (“housekeeping” proteins) are highly expressed in all specimens and show similar transcription/translation rates. In contrast, other paralogues show lower expression levels and the toxins they code for vary among different individuals. These results provide support for the idea that expression and translational differences play a greater role in defining adaptive variation in venom phenotypes than does sequence variation in protein coding genes and that convergent adaptive venom phenotypes can be generated through different molecular mechanisms.