Title: ClinVar and HGMD genomic variant classification accuracy has improved over time, as measured by implied disease burden
Abstract

Background

Curated databases of genetic variants assist clinicians and researchers in interpreting genetic variation. Yet, these databases contain some misclassified variants. It is unclear whether variant misclassification is abating as these databases rapidly grow and implement new guidelines.

Methods

Using archives of ClinVar and HGMD, we investigated how variant misclassification has changed over 6 years, across different ancestry groups. We considered inborn errors of metabolism (IEMs) screened in newborns as a model system because these disorders are often highly penetrant with neonatal phenotypes. We used samples from the 1000 Genomes Project (1KGP) to identify individuals with genotypes that were classified by the databases as pathogenic. Due to the rarity of IEMs, nearly all such classified pathogenic genotypes indicate likely variant misclassification in ClinVar or HGMD.
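The counting step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sample and variant IDs, the gene labels, and the autosomal recessive rule (a sample is flagged when it carries at least two database-pathogenic alleles in one gene) are all assumptions made for illustration.

```python
from collections import defaultdict


def implied_affected(genotypes, pathogenic, variant_gene):
    """Return the set of sample IDs that a database snapshot implies are affected.

    genotypes:    {sample_id: {variant_id: alt_allele_count (0, 1, or 2)}}
    pathogenic:   set of variant IDs labelled pathogenic in one snapshot
    variant_gene: {variant_id: gene label}

    Assumed rule (hypothetical, not from the source): recessive inheritance,
    so a sample is flagged when it carries >= 2 pathogenic alleles in a gene.
    """
    affected = set()
    for sample, calls in genotypes.items():
        alleles_per_gene = defaultdict(int)
        for variant, alt_count in calls.items():
            if variant in pathogenic:
                alleles_per_gene[variant_gene[variant]] += alt_count
        if any(n >= 2 for n in alleles_per_gene.values()):
            affected.add(sample)
    return affected


def burden_by_group(affected, sample_group):
    """Return {group: (implied_affected_count, group_size)}."""
    counts = defaultdict(lambda: [0, 0])
    for sample, group in sample_group.items():
        counts[group][1] += 1
        if sample in affected:
            counts[group][0] += 1
    return {group: (a, t) for group, (a, t) in counts.items()}


# Toy data: all IDs, genes, and group labels below are made up.
genotypes = {
    "S1": {"v1": 2},           # homozygous for v1
    "S2": {"v2": 1, "v3": 1},  # two heterozygous variants in the same gene
    "S3": {"v2": 1},           # single heterozygous carrier
    "S4": {},
}
variant_gene = {"v1": "GENE_A", "v2": "GENE_B", "v3": "GENE_B"}
sample_group = {"S1": "AFR", "S2": "AFR", "S3": "EUR", "S4": "EUR"}

older_snapshot = {"v1", "v2", "v3"}  # hypothetical earlier database release
newer_snapshot = {"v1"}              # v2 and v3 since reclassified

older_affected = implied_affected(genotypes, older_snapshot, variant_gene)
newer_affected = implied_affected(genotypes, newer_snapshot, variant_gene)
older_burden = burden_by_group(older_affected, sample_group)
```

Comparing `older_affected` with `newer_affected` shows how reclassification between snapshots lowers the implied burden. Because the screened disorders are rare, most flagged samples in a reference panel point to a likely misclassified variant rather than to a truly affected individual.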

Results

While the false-positive rates of both ClinVar and HGMD have improved over time, HGMD variants currently imply two orders of magnitude more affected individuals in 1KGP than ClinVar variants. We observed that African ancestry individuals have a significantly increased chance of being incorrectly indicated to be affected by a screened IEM when HGMD variants are used. However, this bias affecting genomes of African ancestry was no longer significant once common variants were removed in accordance with recent variant classification guidelines. We discovered that ClinVar variants classified as Pathogenic or Likely Pathogenic are reclassified sixfold more often than DM or DM? variants in HGMD, which has likely resulted in ClinVar’s lower false-positive rate.

Conclusions

Considering misclassified variants that have since been reclassified reveals our increasing understanding of rare genetic variation. We found that variant classification guidelines and allele frequency databases comprising genetically diverse samples are important factors in reclassification. We also discovered that ClinVar variants common in European and South Asian individuals were more likely to be reclassified to a lower confidence category, perhaps due to an increased chance of these variants being classified by multiple submitters. We discuss features for variant classification databases that would support their continued improvement.

 
NSF-PAR ID:
10431579
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Genome Medicine
Volume:
15
Issue:
1
ISSN:
1756-994X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Background

    The goal of this study is to evaluate germline genetic variants in African American men with metastatic prostate cancer as compared to those in Caucasian men with metastatic prostate cancer in an effort to understand the role of genetic factors in these populations.

    Methods

    African American and Caucasian men with metastatic prostate cancer who had germline testing using multigene panels were used to generate comparisons. Germline genetic results, clinical parameters, and family histories between the two populations were analyzed.

    Results

    A total of 867 patients were included in this retrospective study, including 188 African American and 669 Caucasian patients. There was no significant difference in the likelihood of pathogenic or likely‐pathogenic variants (PV/LPVs) between African American and Caucasian patients (p = .09). African American patients were more likely than Caucasian patients to have a variant of unknown significance (odds ratio [OR] = 1.95; p < .0001). BRCA1 PV/LPVs were more frequent in African Americans (OR = 4.86; p = .04). African American patients were less likely to have a PV/LPV in non‐BRCA DNA repair genes (OR = 0.30; p = .008). Family history of breast (OR = 2.09; p = .002) or ovarian cancer (OR = 2.33; p = .04) predicted PV/LPVs in Caucasians but not in African Americans. This underscores the limitations of family history in African American men and the importance of personal history in guiding their germline testing.

    Conclusions

    In metastatic prostate cancer patients, the overall likelihood of PV/LPVs in tested genes did not vary by race, but BRCA1 PV/LPVs were more common in the African American subset, whereas PV/LPVs in non‐BRCA DNA repair genes were less likely to be encountered in African Americans. Family history was associated with genetic testing results in Caucasians only.

     
  2. Summary

    Objective

    Copy number variations (CNVs) represent a significant genetic risk for several neurodevelopmental disorders including epilepsy. As knowledge increases, reanalysis of existing data is essential. Reliable estimates of the contribution of CNVs to epilepsies from sizeable populations are not available.

    Methods

    We assembled a cohort of 1255 patients with preexisting array comparative genomic hybridization or single nucleotide polymorphism array based CNV data. All patients had “epilepsy plus,” defined as epilepsy with comorbid features, including intellectual disability, psychiatric symptoms, and other neurological and nonneurological features. CNV classification was conducted using a systematic filtering workflow adapted to epilepsy.

    Results

    Of 1097 patients remaining after genetic data quality control, 120 individuals (10.9%) carried at least one autosomal CNV classified as pathogenic; 19 individuals (1.7%) carried at least one autosomal CNV classified as possibly pathogenic. Eleven patients (1%) carried more than one (possibly) pathogenic CNV. We identified CNVs covering recently reported (HNRNPU) or emerging (RORB) epilepsy genes, and further delineated the phenotype associated with mutations of these genes. Additional novel epilepsy candidate genes emerge from our study. Comparing phenotypic features of pathogenic CNV carriers to those of noncarriers of pathogenic CNVs, we show that patients with nonneurological comorbidities, especially dysmorphism, were more likely to carry pathogenic CNVs (odds ratio = 4.09, confidence interval = 2.51‐6.68; P = 2.34 × 10⁻⁹). Meta‐analysis including data from published control groups showed that the presence or absence of epilepsy did not affect the detected frequency of CNVs.

    Significance

    The use of a specifically adapted workflow enabled identification of pathogenic autosomal CNVs in 10.9% of patients with epilepsy plus, which rose to 12.7% when we also considered possibly pathogenic CNVs. Our data indicate that epilepsy with comorbid features should be considered an indication for patients to be selected for a diagnostic algorithm including CNV detection. Collaborative large‐scale CNV reanalysis leads to novel declaration of pathogenicity in unexplained cases and can promote discovery of promising candidate epilepsy genes.

     
  3. Abstract Thus far immunotherapy has had limited impact on ovarian cancer. Vigil (a novel DNA-based multifunctional immune-therapeutic) has shown clinical benefit to prolong relapse-free survival (RFS) and overall survival (OS) in the BRCA wild type and HRP populations. We further analyzed molecular signals related to sensitivity of Vigil treatment. Tissue from patients enrolled in the randomized double-blind trial of Vigil vs. placebo as maintenance in frontline management of advanced resectable ovarian cancer underwent DNA polymorphism analysis. Data was generated from a 981 gene panel to determine the tumor mutation burden and classify variants using Ingenuity Variant Analysis software (Qiagen) or NIH ClinVar. Only variants classified as pathogenic or likely pathogenic were included. STRING application (version 1.5.1) was used to create a protein-protein interaction network. Topological distance and probability of co-mutation were used to calculate the C-score and cumulative C-score (cumC-score). Kaplan–Meier analysis was used to determine the relationship between gene pairs with a high cumC-score and clinical parameters. Improved relapse-free survival in Vigil-treated patients was found for the TP53 m- BRCA wt-HRP group compared to placebo (21.1 months versus 5.6 months; p = 0.0013). Analysis of tumor mutation burden did not reveal statistical benefit in patients receiving Vigil versus placebo. Results suggest a subset of ovarian cancer patients with enhanced susceptibility to Vigil immunotherapy. The hypothesis-generating data presented invite a validation study of Vigil in target identified populations, and support clinical consideration of STRING-generated network application to biomarker characterization with other cancer patients targeted with Vigil.
  4. INTRODUCTION To faithfully distribute genetic material to daughter cells during cell division, spindle fibers must couple to DNA by means of a structure called the kinetochore, which assembles at each chromosome’s centromere. Human centromeres are located within large arrays of tandemly repeated DNA sequences known as alpha satellite (αSat), which often span millions of base pairs on each chromosome. Arrays of αSat are frequently surrounded by other types of tandem satellite repeats, which have poorly understood functions, along with nonrepetitive sequences, including transcribed genes. Previous genome sequencing efforts have been unable to generate complete assemblies of satellite-rich regions because of their scale and repetitive nature, limiting the ability to study their organization, variation, and function.

    RATIONALE Pericentromeric and centromeric (peri/centromeric) satellite DNA sequences have remained almost entirely missing from the assembled human reference genome for the past 20 years. Using a complete, telomere-to-telomere (T2T) assembly of a human genome, we developed and deployed tailored computational approaches to reveal the organization and evolutionary patterns of these satellite arrays at both large and small length scales. We also performed experiments to map precisely which αSat repeats interact with kinetochore proteins. Last, we compared peri/centromeric regions among multiple individuals to understand how these sequences vary across diverse genetic backgrounds.

    RESULTS Satellite repeats constitute 6.2% of the T2T-CHM13 genome assembly, with αSat representing the single largest component (2.8% of the genome). By studying the sequence relationships of αSat repeats in detail across each centromere, we found genome-wide evidence that human centromeres evolve through “layered expansions.” Specifically, distinct repetitive variants arise within each centromeric region and expand through mechanisms that resemble successive tandem duplications, whereas older flanking sequences shrink and diverge over time. We also revealed that the most recently expanded repeats within each αSat array are more likely to interact with the inner kinetochore protein Centromere Protein A (CENP-A), which coincides with regions of reduced CpG methylation. This suggests a strong relationship between local satellite repeat expansion, kinetochore positioning, and DNA hypomethylation. Furthermore, we uncovered large and unexpected structural rearrangements that affect multiple satellite repeat types, including active centromeric αSat arrays. Last, by comparing sequence information from nearly 1600 individuals’ X chromosomes, we observed that individuals with recent African ancestry possess the greatest genetic diversity in the region surrounding the centromere, which sometimes contains a predominantly African αSat sequence variant.

    CONCLUSION The genetic and epigenetic properties of centromeres are closely interwoven through evolution. These findings raise important questions about the specific molecular mechanisms responsible for the relationship between inner kinetochore proteins, DNA hypomethylation, and layered αSat expansions. Even more questions remain about the function and evolution of non-αSat repeats. To begin answering these questions, we have produced a comprehensive encyclopedia of peri/centromeric sequences in a human genome, and we demonstrated how these regions can be studied with modern genomic tools. Our work also illuminates the rich genetic variation hidden within these formerly missing regions of the genome, which may contribute to health and disease. This unexplored variation underlines the need for more T2T human genome assemblies from genetically diverse individuals.

    Figure: Gapless assemblies illuminate centromere evolution. (Top) The organization of peri/centromeric satellite repeats. (Bottom left) A schematic portraying (i) evidence for centromere evolution through layered expansions and (ii) the localization of inner-kinetochore proteins in the youngest, most recently expanded repeats, which coincide with a region of DNA hypomethylation. (Bottom right) An illustration of the global distribution of chrX centromere haplotypes, showing increased diversity in populations with recent African ancestry.
  5. Abstract

    Background

    Three genes clustered together on chromosome 12 comprise a group of hydroxycarboxylic acid receptors (HCARs): HCAR1, HCAR2, and HCAR3. These paralogous genes encode different G-protein coupled receptors responsible for detecting glycolytic metabolites and controlling fatty acid oxidation. Though better known for regulating lipid metabolism in adipocytes, more recently, HCARs have been functionally associated with breast cancer proliferation/survival; HCAR2 has been described as a tumor suppressor and HCAR1 and HCAR3 as oncogenes. Thus, we sought to identify germline variants in HCAR1, HCAR2, and HCAR3 that could potentially be associated with breast cancer risk.

    Methods

    Two different cohorts of breast cancer cases were investigated, the Alabama Hereditary Cancer Cohort and The Cancer Genome Atlas, which were analyzed through nested PCRs/Sanger sequencing and whole-exome sequencing, respectively. All datasets were screened for rare, non-synonymous coding variants.

    Results

    Variants were identified in both breast cancer cohorts, some of which appeared to be associated with breast cancer risk, including HCAR1 c.58C>G (p.P20A), HCAR2 c.424C>T (p.R142W), HCAR2 c.517_518delinsAC (p.G173T), HCAR2 c.1036A>G (p.M346V), HCAR2 c.1086_1090del (p.P363Nfs*26), HCAR3 c.560G>A (p.R187Q), and HCAR3 c.1117delC (p.Q373Kfs*82). Additionally, HCAR2 c.515C>T (p.S172L), a previously identified loss-of-function variant, was identified.

    Conclusions

    Due to the important role of HCARs in breast cancer, it is vital to understand how these genetic variants play a role in breast cancer risk and proliferation and their consequences on treatment strategies. Additional studies will be needed to validate these findings. Nevertheless, the identification of these potentially pathogenic variants supports the need to investigate their functional consequences.

     