Insect silk is a versatile biomaterial. Lepidoptera and Trichoptera display some of the most diverse uses of silk, with varying strength, adhesive qualities, and elastic properties. Silk fibroin genes are long (>20 kbp), with many repetitive motifs that make them challenging to sequence. Most research thus far has focused on conserved N- and C-terminal regions of fibroin genes because a full comparison of repetitive regions across taxa has not been possible. Using the PacBio Sequel II system and SMRT sequencing, we generated high-fidelity (HiFi) long-read genomic and transcriptomic sequences for the Indianmeal moth (Plodia interpunctella) and genomic sequences for the caddisfly Eubasilissa regina. Both genomes were highly contiguous (N50 = 9.7 Mbp/32.4 Mbp, L50 = 13/11) and complete (BUSCO complete = 99.3%/95.2%), with complete and contiguous recovery of silk heavy fibroin gene sequences. We show that HiFi long-read sequencing is helpful for understanding genes with long, repetitive regions.
Allelic resolution of insect and spider silk genes reveals hidden genetic diversity
Arthropod silk is vital to the evolutionary success of hundreds of thousands of species. The primary proteins in silks are often encoded by long, repetitive gene sequences. Until recently, sequencing and assembling these complex gene sequences proved intractable because of their repetitive structure. Here, using high-quality long-read sequencing, we show that there is extensive variation, in both length and repeat motif order, between alleles of silk genes within individual arthropods. Further, this variation exists across two deep, independent origins of silk that diverged more than 500 Mya: the insect clade containing caddisflies and butterflies, and the spiders. This remarkable convergence in previously overlooked patterns of allelic variation across multiple origins of silk suggests common mechanisms for the generation and maintenance of structural protein-coding genes. Future genomic efforts to connect genotypes to phenotypes should account for such allelic variation.
- PAR ID: 10426965
- Date Published:
- Journal Name: Proceedings of the National Academy of Sciences
- Volume: 120
- Issue: 18
- ISSN: 0027-8424
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Macqueen, D (Ed.) Abstract: Spider silks are renowned for their high-performance mechanical properties. Contributing to these properties are proteins encoded by the spidroin (spider fibroin) gene family. Spidroins have been discovered mostly through cDNA studies of females based on the presence of conserved terminal regions and a repetitive central region. Recently, genome sequencing of the golden orb-web weaver, Trichonephila clavipes, provided a complete picture of spidroin diversity. Here, we refine the annotation of T. clavipes spidroin genes, including the reclassification of some as non-spidroins. We rename these non-spidroins as spidroin-like (SpL) genes because they have repetitive sequences and amino acid compositions like spidroins, but entirely lack the archetypal terminal domains of spidroins. We then examined the function of these spidroin and SpL genes through tissue- and sex-specific gene expression studies. Using qPCR, we show that some silk genes are upregulated in male silk glands compared to females, despite males producing less silk in general. We also find that an enigmatic spidroin that lacks a spidroin C-terminal domain is highly expressed in silk glands, suggesting that spidroins could assemble into fibers without a canonical terminal region. Further, we show that two SpL genes are expressed in silk glands, with one gene highly evolutionarily conserved across species, providing evidence that particular SpL genes are important to silk production. Together, these findings challenge long-standing paradigms regarding the evolutionary and functional significance of the proteins and conserved motifs essential for producing spider silks.
-
Red tilapia are favored by consumers, but the molecular genetic basis for this color pattern is unknown. Here we report on the genetic and physical mapping of the red locus in two strains of tilapia. We raised ~3000 hybrid individuals to map the red locus to a single bacterial artificial chromosome clone on linkage group 3. Long-read sequencing allowed us to assemble contigs spanning both the black and red haplotypes. The red haplotype contains additional repetitive sequence totaling almost one megabase that includes no obvious candidate genes. We suggest that the red phenotype may arise from substitutions in a protein in the primary cilia (Ccdc149), or changes in the expression of a nearby gene (nckx2). Red mutations in several unlinked loci have now been identified, creating an opportunity to identify the best allelic combinations for aquacultural production.
-
Rokas, A (Ed.) Abstract: Subtelomeres are dynamic genomic regions shaped by elevated rates of recombination, mutation, and gene birth/death. These processes contribute to the formation of lineage-specific gene family expansions that commonly occupy subtelomeres across eukaryotes. Investigating the evolution of subtelomeric gene families is complicated by repetitive DNA and high sequence similarity among gene family members, which prevent accurate assembly from whole genome sequences. Here, we investigated the evolution of the telomere-associated (TLO) gene family in Candida albicans using 189 complete coding sequences retrieved from 23 genetically diverse strains across the species. Tlo genes conformed to the 3 major architectural groups (α/β/γ) previously defined in the genome reference strain but significantly differed in the degree of within-group diversity. One group, Tloβ, was always found at the same chromosome arm with strong sequence similarity among all strains. In contrast, diverse Tloα sequences have proliferated among chromosome arms. Tloγ genes formed 7 primary clades that included each of the previously identified Tloγ genes from the genome reference strain, with 3 Tloγ genes always found on the same chromosome arm among strains. Architectural groups displayed regions of high conservation that resolved newly identified functional motifs, providing insight into potential regulatory mechanisms that distinguish groups. Thus, by resolving intraspecies subtelomeric gene variation, it is possible to identify previously unknown gene family complexity that may underpin adaptive functional variation.
-
Suh, Alexander (Ed.) Abstract: Although spiders are one of the most diverse groups of arthropods, the genetic architecture of their evolutionary adaptations is largely unknown. Specifically, ancient genome-wide duplication occurring during arachnid evolution ~450 mya resulted in a vast assembly of gene families, yet the extent to which selection has shaped this variation is understudied. To aid in comparative genome sequence analyses, we provide a chromosome-level genome of the Western black widow spider (Latrodectus hesperus), a species of interest for its silk properties, its venom applications, and its use as a model for urban adaptation. We used long-read and Hi-C sequencing data, combined with transcriptomes, to assemble 14 chromosomes in a 1.46 Gb genome, with 38,393 genes annotated and a BUSCO score of 95.3%. Our analyses identified high repetitive gene content and heterozygosity, consistent with other spider genomes, which has led to challenges in genome characterization. Our comparative evolutionary analyses of eight genomes available for species within the Araneoidea group (orb weavers and their descendants) identified 1,827 single-copy orthologs. Of these, 155 exhibit significant positive selection primarily associated with developmental genes, and with traits linked to sensory perception. These results support the hypothesis that several traits unique to spiders emerged from the adaptive evolution of ohnologs (retained ancestrally duplicated genes) from ancient genome-wide duplication. These comparative spider genome analyses can serve as a model to understand how positive selection continually shapes ancestral duplications in generating novel traits today within and between diverse taxonomic groups.