Abstract Hearing loss is the leading sensory deficit, affecting ~5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and a thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are “variants of uncertain significance” (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆GFold) for all DVD missense variants. We find that 5772 VUSs have a large, destabilizing ∆∆GFold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine 3456 VUSs are likely pathogenic at a probability of 99.0%. Of the 224 genes in the DVD, 166 genes (74%) exhibit one or more missense variants predicted to cause a pathogenic change in protein folding stability. The VUSs prioritized here affect 119 patients (~3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
This content will become publicly available on December 1, 2025
Genomic Landscape of Chromosome X Factor VIII: From Hemophilia A in Males to Risk Variants in Females
Background: Variants within factor VIII (F8) are associated with sex-linked hemophilia A and thrombosis, with gene therapy approaches being available for pathogenic variants. Many variants within F8 remain variants of uncertain significance (VUS) or are under-explored as to their connections to phenotypic outcomes. Methods: We assessed data on F8 expression while screening the UniProt, ClinVar, Geno2MP, and gnomAD databases for F8 missense variants; these collectively represent the sequencing of more than a million individuals. Results: For the two F8 isoforms coding for different protein lengths (2351 and 216 amino acids), we observed noncoding variants influencing expression which are also associated with thrombosis risk, with uncertainty as to differences in females and males. Variant analysis identified a severe stratification of potential annotation issues for missense variants in subjects of non-European ancestry, suggesting a need for further defining the genetics of diverse populations. Additionally, few heterozygous female carriers of known pathogenic variants have sufficiently confident phenotyping data, leaving researchers unable to determine subtle, less defined phenotypes. Using structure movement correlations to known pathogenic variants for the VUS, we determined seven clusters of likely pathogenic variants based on screening work. Conclusions: This work highlights the need to define missense variants, especially those for VUS and from subjects of non-European ancestry, as well as the roles of these variants in women’s physiology.
- Award ID(s): 2120918
- PAR ID: 10614651
- Publisher / Repository: Multidisciplinary Digital Publishing Institute
- Date Published:
- Journal Name: Genes
- Volume: 15
- Issue: 12
- ISSN: 2073-4425
- Page Range / eLocation ID: 1522
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Wallqvist, Anders (Ed.) Many pathogenic missense mutations are found in protein positions that are neither well-conserved nor located in any known functional domain. Consequently, we lack a mechanistic underpinning of the dysfunction caused by such mutations. We explored the disruption of allosteric dynamic coupling between these positions and the known functional sites as a possible mechanism for pathogenesis. In this study, we present an analysis of 591 pathogenic missense variants in 144 human enzymes that suggests that allosteric dynamic coupling of mutated positions with known active sites is a plausible biophysical mechanism and evidence of their functional importance. We illustrate this mechanism in a case study of β-Glucocerebrosidase (GCase) in which a vast majority of 94 sites harboring Gaucher disease-associated missense variants are located some distance away from the active site. An analysis of the conformational dynamics of GCase suggests that mutations on these distal sites cause changes in the flexibility of active site residues despite their distance, indicating a dynamic communication network throughout the protein. The disruption of the long-distance dynamic coupling caused by missense mutations may provide a plausible general mechanistic explanation for biological dysfunction and disease.
- Abstract Background: Curated databases of genetic variants assist clinicians and researchers in interpreting genetic variation. Yet, these databases contain some misclassified variants. It is unclear whether variant misclassification is abating as these databases rapidly grow and implement new guidelines. Methods: Using archives of ClinVar and HGMD, we investigated how variant misclassification has changed over 6 years, across different ancestry groups. We considered inborn errors of metabolism (IEMs) screened in newborns as a model system because these disorders are often highly penetrant with neonatal phenotypes. We used samples from the 1000 Genomes Project (1KGP) to identify individuals with genotypes that were classified by the databases as pathogenic. Due to the rarity of IEMs, nearly all such classified pathogenic genotypes indicate likely variant misclassification in ClinVar or HGMD. Results: While the false-positive rates of both ClinVar and HGMD have improved over time, HGMD variants currently imply two orders of magnitude more affected individuals in 1KGP than ClinVar variants. We observed that African ancestry individuals have a significantly increased chance of being incorrectly indicated to be affected by a screened IEM when HGMD variants are used. However, this bias affecting genomes of African ancestry was no longer significant once common variants were removed in accordance with recent variant classification guidelines. We discovered that ClinVar variants classified as Pathogenic or Likely Pathogenic are reclassified sixfold more often than DM or DM? variants in HGMD, which has likely resulted in ClinVar’s lower false-positive rate. Conclusions: Considering misclassified variants that have since been reclassified reveals our increasing understanding of rare genetic variation. We found that variant classification guidelines and allele frequency databases comprising genetically diverse samples are important factors in reclassification. We also discovered that ClinVar variants common in European and South Asian individuals were more likely to be reclassified to a lower confidence category, perhaps due to an increased chance of these variants being classified by multiple submitters. We discuss features for variant classification databases that would support their continued improvement.
- Genetic determinants of global developmental delay and intellectual disability in Ukrainian children
  Abstract Background: Global developmental delay or intellectual disability usually accompanies various genetic disorders as a part of the syndrome, which may include seizures, autism spectrum disorder and multiple congenital abnormalities. Next-generation sequencing (NGS) techniques have improved the identification of pathogenic variants and genes related to developmental delay. This study aimed to evaluate the yield of whole exome sequencing (WES) and neurodevelopmental disorder gene panel sequencing in a pediatric cohort from Ukraine. Additionally, the study computationally predicted the effect of variants of uncertain significance (VUS) based on recently published genetic data from the country’s healthy population. Methods: The study retrospectively analyzed WES or gene panel sequencing findings of 417 children with global developmental delay, intellectual disability, and/or other symptoms. Variants of uncertain significance were annotated using CADD-Phred and SIFT prediction scores, and their frequency in the healthy population of Ukraine was estimated. Results: A definitive molecular diagnosis was established in 66 (15.8%) of the individuals. WES diagnosed 22 out of 37 cases (59.4%), while the neurodevelopmental gene panel identified 44 definitive diagnoses among the 380 tested patients (12.1%). Non-diagnostic findings (VUS and carrier) were reported in 350 (83.2%) individuals. The most frequently diagnosed conditions were developmental and epileptic encephalopathies associated with severe epilepsy and GDD/ID (associated genes ARX, CDKL5, STXBP1, KCNQ2, SCN2A, KCNT1, KCNA2). Additionally, we annotated 221 VUS classified as potentially damaging, AD or X-linked, potentially increasing the diagnostic yield by 30%, but 18 of these variants were present in the healthy population of Ukraine. Conclusions: This study provides the first comprehensive investigation of the genetic causes of GDD/ID in Ukraine. It presents a substantial dataset of diagnosed genetic conditions associated with GDD/ID. The results support the utilization of NGS gene panels and WES as first-line diagnostic tools for GDD/ID cases, particularly in resource-limited settings. A comprehensive approach to resolving VUS, including computational effect prediction, population frequency analysis, and phenotype assessment, can aid in further reclassification of deleterious VUS and guide further testing in families.
- Abstract The Biorepository and Integrative Genomics (BIG) Initiative in Tennessee has developed a pioneering resource to address gaps in genomic research by linking genomic, phenotypic, and environmental data from a diverse Mid-South population, including underrepresented groups. We analyzed 13,152 exomes from BIG and found significant genetic diversity, with 50% of participants inferred to have non-European or several types of admixed ancestry. Ancestry within the BIG cohort is stratified, with distinct geographic and demographic patterns: African ancestry is more common in urban areas, while European ancestry is more common in suburban regions. We observe ancestry-specific rates of novel genetic variants, which are enriched for functional or clinical relevance. Disease prevalence analysis linked ancestry and environmental factors, showing higher odds ratios for asthma and obesity in minority groups, particularly in the urban area. Finally, we observe discrepancies between self-reported race and genetic ancestry, with related individuals self-identifying in differing racial categories. These findings underscore the limitations of race as a biomedical variable. BIG has proven to be an effective model for community-centered precision medicine. We integrated genomics education and fostered great trust among the contributing communities. Future goals include cohort expansion and enhanced genomic analysis to ensure equitable healthcare outcomes.
