
Title: No Evidence of Copy Number Variation in Acidic Mammalian Chitinase Genes (CHIA) in New World and Old World Monkeys
Copy number variation may be the most common form of structural genetic variation in the genome. Numerous studies have shown that gene copy number variation can correlate with phenotypic variation, where higher copy numbers correspond to increased expression of the protein and vice versa. Examples include some digestive enzyme genes, where variation in copy numbers and protein expression may be related to dietary differences. Increasing the expression of a digestive enzyme through higher gene copy numbers may thus be a potential mechanism for altering an organism’s digestive capabilities. I investigated copy number variation in genes coding for acidic mammalian chitinase, a chitinolytic digestive enzyme that may be used for the digestion of insect exoskeletons, in nonhuman primates with varying levels of insect consumption. I hypothesized that CHIA copy number correlates positively with level of insectivory, predicting higher copy numbers in more insectivorous primates. I assessed copy number variation with the QuantStudio 3D digital PCR platform, in a comparative sample of Old World and New World primate species (N = 10 species, one or two individuals each). Contrary to my prediction, no evidence of copy number variation was found and all species tested had two gene copies per diploid genome. These findings suggest that if acidic mammalian chitinase expression varies according to insect consumption in primates, it may be up- or downregulated through another mechanism.
Authors:
Award ID(s):
1650864
Publication Date:
NSF-PAR ID:
10099914
Journal Name:
International journal of primatology
ISSN:
1573-8604
Sponsoring Org:
National Science Foundation
More Like this
  1. INTRODUCTION Transposable elements (TEs), repeat expansions, and repeat-mediated structural rearrangements play key roles in chromosome structure and species evolution, contribute to human genetic variation, and substantially influence human health through copy number variants, structural variants, insertions, deletions, and alterations to gene transcription and splicing. Despite their formative role in genome stability, repetitive regions have been relegated to gaps and collapsed regions in human genome reference GRCh38 owing to the technological limitations during its development. The lack of linear sequence in these regions, particularly in centromeres, resulted in the inability to fully explore the repeat content of the human genome in the context of both local and regional chromosomal environments. RATIONALE Long-read sequencing supported the complete, telomere-to-telomere (T2T) assembly of the pseudo-haploid human cell line CHM13. This resource affords a genome-scale assessment of all human repetitive sequences, including TEs and previously unknown repeats and satellites, both within and outside of gaps and collapsed regions. Additionally, a complete genome enables the opportunity to explore the epigenetic and transcriptional profiles of these elements that are fundamental to our understanding of chromosome structure, function, and evolution. Comparative analyses reveal modes of repeat divergence, evolution, and expansion or contraction with locus-level resolution. RESULTS We implemented a comprehensive repeat annotation workflow using previously known human repeats and de novo repeat modeling followed by manual curation, including assessing overlaps with gene annotations, segmental duplications, tandem repeats, and annotated repeats. Using this method, we developed an updated catalog of human repetitive sequences and refined previous repeat annotations.
We discovered 43 previously unknown repeats and repeat variants and characterized 19 complex, composite repetitive structures, which often carry genes, across T2T-CHM13. Using precision nuclear run-on sequencing (PRO-seq) and CpG methylated sites generated from Oxford Nanopore Technologies long-read sequencing data, we assessed RNA polymerase engagement across retroelements genome-wide, revealing correlations between nascent transcription, sequence divergence, CpG density, and methylation. These analyses were extended to evaluate RNA polymerase occupancy for all repeats, including high-density satellite repeats that reside in previously inaccessible centromeric regions of all human chromosomes. Moreover, using both mapping-dependent and mapping-independent approaches across early developmental stages and a complete cell cycle time series, we found that engaged RNA polymerase across satellites is low; in contrast, TE transcription is abundant and serves as a boundary for changes in CpG methylation and centromere substructure. Together, these data reveal the dynamic relationship between transcriptionally active retroelement subclasses and DNA methylation, as well as potential mechanisms for the derivation and evolution of new repeat families and composite elements. Focusing on the emerging T2T-level assembly of the HG002 X chromosome, we reveal that a high level of repeat variation likely exists across the human population, including composite element copy numbers that affect gene copy number. Additionally, we highlight the impact of repeats on the structural diversity of the genome, revealing repeat expansions with extreme copy number differences between humans and primates while also providing high-confidence annotations of retroelement transduction events. 
CONCLUSION The comprehensive repeat annotations and updated repeat models described herein serve as a resource for expanding the compendium of human genome sequences and reveal the impact of specific repeats on the human genome. In developing this resource, we provide a methodological framework for assessing repeat variation within and between human genomes. The exhaustive assessment of the transcriptional landscape of repeats, at both the genome scale and locally, such as within centromeres, sets the stage for functional studies to disentangle the role transcription plays in the mechanisms essential for genome stability and chromosome segregation. Finally, our work demonstrates the need to increase efforts toward achieving T2T-level assemblies for nonhuman primates and other species to fully understand the complexity and impact of repeat-derived genomic innovations that define primate lineages, including humans. Telomere-to-telomere assembly of CHM13 supports repeat annotations and discoveries. The human reference T2T-CHM13 filled gaps and corrected collapsed regions (triangles) in GRCh38. Combining long read–based methylation calls, PRO-seq, and multilevel computational methods, we provide a compendium of human repeats, define retroelement expression and methylation profiles, and delineate locus-specific sites of nascent transcription genome-wide, including previously inaccessible centromeres. SINE, short interspersed element; SVA, SINE–variable number tandem repeat–Alu; LINE, long interspersed element; LTR, long terminal repeat; TSS, transcription start site; pA, xxxxxxxxxxxxxxxx.
  2. Although cases of independent adaptation to the same dietary niche have been documented in mammalian ecology, the molecular correlates of such shifts are seldom known. Here, we used genomewide analyses of molecular evolution to examine two lineages of bats that, from an insectivorous ancestor, have both independently evolved obligate frugivory: the Old World family Pteropodidae and the neotropical subfamily Stenodermatinae. New genome assemblies from two neotropical fruit bats (Artibeus jamaicensis and Sturnira hondurensis) provide a framework for comparisons with Old World fruit bats. Comparative genomics of 10 bat species encompassing dietary diversity across the phylogeny revealed convergent molecular signatures of frugivory in both multigene family evolution and single‐copy genes. Evidence for convergent molecular adaptations associated with frugivorous diets includes the composition of three subfamilies of olfactory receptor genes, losses of three bitter taste receptor genes, losses of two digestive enzyme genes and convergent amino acid substitutions in several metabolic genes. By identifying suites of adaptations associated with the convergent evolution of frugivory, our analyses both reveal the extent of molecular mechanisms under selection in dietary shifts and will facilitate future studies of molecular ecology in mammals.
  3. Many mammals can digest starch by using an enzyme called amylase, but different species eat different amounts of starchy foods. Amylase is released by the pancreas, and in certain species such as humans, it is also created by the glands that produce saliva, allowing the enzyme to be present in the mouth. There, amylase can start to break down starch, releasing a sweet taste that helps the animal to detect starchy foods. Curiously, humans have multiple copies of the gene that codes for the enzyme, but the exact number varies between people. Previous research has found that populations with more copies also eat more starch; if this correlation also existed in other species, it could help to understand how diets influence and shape genetic information. In addition, it is unclear how amylase came to be present in saliva, as the ancestors of mammals only produced the protein in the pancreas. Pajic et al. analyzed the genomes of a range of mammals and found that the more starch a species had in its diet, the more amylase gene copies it harbored in its genome. In fact, unrelated mammals living in different habitats and eating different types of food have similar numbers of amylase gene copies if they have the same level of starch in their diet. In addition, Pajic et al. discovered that animals such as mice, rats, pigs and dogs, which have lived in close contact with people for thousands of years, quickly adapted to the large amount of starch present in human food. In each of these species, a mechanism called gene duplication independently created new copies of the amylase gene. This could represent the first step towards some of these copies becoming active in the glands that release saliva. In people, having fewer copies of the amylase gene could mean they have a higher risk for diabetes; this number is also tied to the composition of the collection of bacteria that live in the mouth and the gut.
Understanding how the copy number of the amylase gene affects biology will help to grasp how it also affects health and wellbeing, in humans and in our four-legged companions.
  4. Background

    Genome size is implicated in the form, function, and ecological success of a species. Two principally different mechanisms are proposed as major drivers of eukaryotic genome evolution and diversity: polyploidy (i.e., whole-genome duplication) or smaller duplication events and bursts in the activity of repetitive elements. Here, we generated de novo genome assemblies of 17 caddisflies covering all major lineages of Trichoptera. Using these and previously sequenced genomes, we use caddisflies as a model for understanding genome size evolution in diverse insect lineages.

    Results

    We detect a ∼14-fold variation in genome size across the order Trichoptera. We find strong evidence that repetitive element expansions, particularly those of transposable elements (TEs), are important drivers of large caddisfly genome sizes. Using an innovative method to examine TEs associated with universal single-copy orthologs (i.e., BUSCO genes), we find that TE expansions have a major impact on protein-coding gene regions, with TE-gene associations showing a linear relationship with increasing genome size. Intriguingly, we find that expanded genomes preferentially evolved in caddisfly clades with a higher ecological diversity (i.e., various feeding modes, diversification in variable, less stable environments).

    Conclusion

    Our findings provide a platform to test hypotheses about the potential evolutionary roles of TE activity and TE-gene associations, particularly in groups with high species, ecological, and functional diversities.

  5. Evolutionary transitions to a social lifestyle in insects are associated with lineage-specific changes in gene expression, but the key nodes that drive these regulatory changes are unknown. We examined the relationship between social organization and lineage-specific microRNAs (miRNAs). Genome scans across 12 bee species showed that miRNA copy-number is mostly conserved and not associated with sociality. However, deep sequencing of small RNAs in six bee species revealed a substantial proportion (20–35%) of detected miRNAs had lineage-specific expression in the brain, 24–72% of which did not have homologues in other species. Lineage-specific miRNAs disproportionately target lineage-specific genes, and have lower expression levels than shared miRNAs. The predicted targets of lineage-specific miRNAs are not enriched for genes with caste-biased expression or genes under positive selection in social species. Together, these results suggest that novel miRNAs may coevolve with novel genes, and thus contribute to lineage-specific patterns of evolution in bees, but do not appear to have significant influence on social evolution. Our analyses also support the hypothesis that many new miRNAs are purged by selection due to deleterious effects on mRNA targets, and suggest genome structure is not as influential in regulating bee miRNA evolution as has been shown for mammalian miRNAs.