skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: An evolving view of copy number variants
Copy number variants (CNVs) are regions of the genome that vary in integer copy number. CNVs, which comprise both amplifications and deletions of DNA sequence, have been identified across all domains of life, from bacteria and archaea to plants and animals. CNVs are an important source of genetic diversity, and can drive rapid adaptive evolution and progression of heritable and somatic human diseases, such as cancer. However, despite their evolutionary importance and clinical relevance, CNVs remain understudied compared to single-nucleotide variants (SNVs). This is a consequence of the inherent difficulties in detecting CNVs at low-to-intermediate frequencies in heterogeneous populations of cells. Here, we discuss molecular methods used to detect CNVs, the limitations associated with using these techniques, and the application of new and emerging technologies that present solutions to these challenges. The goal of this short review and perspective is to highlight aspects of CNV biology that are understudied and define avenues for further research that address specific gaps in our knowledge of these complex alleles. We describe our recently developed method for CNV detection in which a fluorescent gene functions as a single-cell CNV reporter and present key findings from our evolution experiments in Saccharomyces cerevisiae. Using a CNV reporter, we found that CNVs are generated at a high rate and undergo selection with predictable dynamics across independently evolving replicate populations. Many CNVs appear to be generated through DNA replication-based processes that are mediated by the presence of short, interrupted, inverted-repeat sequences. Our results have important implications for the role of CNVs in evolutionary processes and the molecular mechanisms that underlie CNV formation. We discuss the possible extension of our method to other applications, including tracking the dynamics of CNVs in models of human tumors.  more » « less
Award ID(s):
1818234
PAR ID:
10114991
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Current Genetics
ISSN:
0172-8083
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. de Visser, J. Arjan (Ed.)
    The rate of adaptive evolution depends on the rate at which beneficial mutations are introduced into a population and the fitness effects of those mutations. The rate of beneficial mutations and their expected fitness effects is often difficult to empirically quantify. As these 2 parameters determine the pace of evolutionary change in a population, the dynamics of adaptive evolution may enable inference of their values. Copy number variants (CNVs) are a pervasive source of heritable variation that can facilitate rapid adaptive evolution. Previously, we developed a locus-specific fluorescent CNV reporter to quantify CNV dynamics in evolving populations maintained in nutrient-limiting conditions using chemostats. Here, we use CNV adaptation dynamics to estimate the rate at which beneficial CNVs are introduced through de novo mutation and their fitness effects using simulation-based likelihood–free inference approaches. We tested the suitability of 2 evolutionary models: a standard Wright–Fisher model and a chemostat model. We evaluated 2 likelihood-free inference algorithms: the well-established Approximate Bayesian Computation with Sequential Monte Carlo (ABC-SMC) algorithm, and the recently developed Neural Posterior Estimation (NPE) algorithm, which applies an artificial neural network to directly estimate the posterior distribution. By systematically evaluating the suitability of different inference methods and models, we show that NPE has several advantages over ABC-SMC and that a Wright–Fisher evolutionary model suffices in most cases. Using our validated inference framework, we estimate the CNV formation rate at the GAP1 locus in the yeast Saccharomyces cerevisiae to be 10 −4.7 to 10 −4 CNVs per cell division and a fitness coefficient of 0.04 to 0.1 per generation for GAP1 CNVs in glutamine-limited chemostats. We experimentally validated our inference-based estimates using 2 distinct experimental methods—barcode lineage tracking and pairwise fitness assays—which provide independent confirmation of the accuracy of our approach. Our results are consistent with a beneficial CNV supply rate that is 10-fold greater than the estimated rates of beneficial single-nucleotide mutations, explaining the outsized importance of CNVs in rapid adaptive evolution. More generally, our study demonstrates the utility of novel neural network–based likelihood–free inference methods for inferring the rates and effects of evolutionary processes from empirical data with possible applications ranging from tumor to viral evolution. 
    more » « less
  2. Abstract Duplicated genes provide the opportunity for evolutionary novelty and adaptive divergence. In many cases, having more gene copies increases gene expression, which might facilitate adaptation to stressful or novel environments. Conversely, overexpression or misexpression of duplicated genes can be detrimental and subject to negative selection. In this scenario, newly duplicate genes may evade purifying selection if they are epigenetically silenced, at least temporarily, leading them to persist in populations as copy number variations (CNVs). In animals and plants, younger gene duplicates tend to have higher levels of DNA methylation and lower levels of gene expression, suggesting epigenetic regulation could promote the retention of gene duplications via expression repression or silencing. Here, we test the hypothesis that DNA methylation variation coincides with young duplicate genes that are segregating as CNVs in six populations of the three‐spined stickleback that span a salinity gradient from 4 to 30 PSU. Using reduced‐representation bisulfite sequencing, we found DNA methylation and CNV differentiation outliers rarely overlapped. Whereas lineage‐specific genes and young duplicates were found to be highly methylated, just two gene CNVs showed a significant association between promoter methylation level and copy number, suggesting that DNA methylation might not interact with CNVs in our dataset. If most new duplications are regulated for dosage by epigenetic mechanisms, our results do not support a strong contribution from DNA methylation soon after duplication. Instead, our results are consistent with a preference to duplicate genes that are already highly methylated. 
    more » « less
  3. Abstract Inverted duplicated DNA sequences are a common feature of structural variants (SVs) and copy number variants (CNVs). Analysis of CNVs containing inverted duplicated DNA sequences using nanopore sequencing identified recurrent aberrant behavior characterized by low confidence, incorrect and missed base calls. Inverted duplicate DNA sequences in both yeast and human samples were observed to have systematic elevation in the electrical current detected at the nanopore, increased translocation rates and decreased sampling rates. The coincidence of inverted duplicated DNA sequences with dramatically reduced sequencing accuracy and an increased translocation rate suggests that secondary DNA structures may interfere with the dynamics of transit of the DNA through the nanopore. 
    more » « less
  4. In recent years, obesity has reached epidemic proportions globally and has become a major public health concern. The development of obesity is likely caused by several behavioral, environmental, and genetic factors. Genomic variability among individuals is largely due to copy number variations (CNVs). Recent genome-wide association studies (GWAS) have successfully identified many loci containing CNV related to obesity. These obesity-related CNVs are informative to the diagnosis and treatment of genomic diseases. A more comprehensive classification of CNVs may provide the basis for determining how genomic diversity impacts the mechanisms of expression for obesity in children and adults of a variety of genders and ethnicities. In this review, we summarize current knowledge on the relationship between obesity and the CNV of several genomic regions, with an emphasis on genes at the following loci: 11q11, 1p21.1, 10q11.22, 10q26.3, 16q12.2, 16p12.3, and 4q25. 
    more » « less
  5. Abstract Phased genomes and pangenomes are enhancing our understanding of genetic variation. Accurate phasing and assembly in repetitive regions of the genome remain challenging, however. Addressing this obstacle is crucial for studying structural genomic variation, such as copy number variations (CNVs) common to repetitive regions. Polar fishes, for example, evolved repetitive tandem arrays of antifreeze protein (AFP) genes that facilitate adaptation to freezing and expanded in copy number in colder environments. AFP CNVs remain poorly characterized in polar fishes and may be illuminated by haplotype-aware approaches. We performed long-read sequencing for two polar fishes in the suborder Zoarcoidei and leveraged additional published long-read data to assemble phased genomes. We developed a workflow to measure haplotype diversity in CNV while controlling for misassembly and switch errors—a change from one parental haplotype to another in a contiguous assembly. We presentgfa_parser, which computes and extracts all possible contiguous sequences for phased or primary assemblies from graphical fragment assembly (GFA) files, andswitch_error_screen, which flags potential switch errors.gfa_parserrevealed that assembly uncertainty was ubiquitous across AFP array haplotypes and that standard processing of graphical fragment assemblies can bias measurement of haplotype CNVs. We detected no switch errors in AFP arrays. After controlling for misassembly and switch error, we detected haplotype diversity of AFP CNVs in all studied polar Zoarcoidei species and in 60% of AFP arrays. Intraindividual haplotype diversity spanned differences of 3–16 copies. Our workflow revealed intraspecific genomic diversity in zoarcoids that likely fueled the evolution of AFP copy number across temperature. 
    more » « less