Title: A predicted deleterious allele of the essential meiosis gene MND1, present in ~ 3% of East Asians, does not disrupt reproduction in mice
Abstract

Infertility is a major health problem affecting ~15% of couples worldwide. Except for cases involving readily detectable chromosome aberrations, confident identification of a causative genetic defect is problematic. Despite the advent of genome sequencing for diagnostic purposes, the preponderance of segregating genetic variants complicates identification of culprit genetic alleles or mutations. Many algorithms have been developed to predict the effects of ‘variants of unknown significance’, typically single nucleotide polymorphisms (SNPs), but these predictions are not sufficiently accurate for clinical action. As part of a project to identify population variants that impact fertility, we have been generating clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9-edited mouse models of suspect SNPs in genes that are known to be required for fertility in mice. Here, we present data on a non-synonymous (amino acid altering) SNP (rs140107488) in the meiosis gene Mnd1, which is predicted bioinformatically to be deleterious to protein function. We report that when modeled in mice, this allele (MND1K85M), which is present at an allele frequency of ~3% in East Asians, has no discernible effect on fertility, fecundity or gametogenesis, although it may cause sex skewing of progeny from homozygous males. In sum, assuming the mouse model accurately reflects the impact of this variant in humans, rs140107488 appears to be a benign allele that can be eliminated or de-prioritized in clinical genomic analyses of infertility patients.

 
NSF-PAR ID: 10121386
Publisher / Repository: Oxford University Press
Journal Name: Molecular Human Reproduction
ISSN: 1460-2407
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Eve, Alex (Ed.)
    Genetic analyses of mammalian gametogenesis and fertility have the potential to inform about two important and interrelated clinical areas: infertility and contraception. Here, we address the genetics and genomics underlying gamete formation, productivity and function in the context of reproductive success in mammalian systems, primarily mouse and human. Although much is known about the specific genes and proteins required for meiotic processes and sperm function, we know relatively little about other gametic determinants of overall fertility, such as regulation of gamete numbers, duration of gamete production, and gamete selection and function in fertilization. As fertility is not a binary trait, attention is now appropriately focused on the oligogenic, quantitative aspects of reproduction. Multiparent mouse populations, created by complex crossing strategies, exhibit genetic diversity similar to human populations and will be valuable resources for genetic discovery, helping to overcome current limitations to our knowledge of mammalian reproductive genetics. Finally, we discuss how what we know about the genomics of reproduction can ultimately be brought to the clinic, informing our concepts of human fertility and infertility, and improving assisted reproductive technologies. 
  2. INTRODUCTION

    Thousands of genetic variants have been associated with human diseases and traits through genome-wide association studies (GWASs). Translating these discoveries into improved therapeutics requires discerning which variants among hundreds of candidates are causally related to disease risk. To date, only a handful of causal variants have been confirmed. Here, we leverage 100 million years of mammalian evolution to address this major challenge.

    RATIONALE

    We compared genomes from hundreds of mammals and identified bases with unusually few variants (evolutionarily constrained). Constraint is a measure of functional importance that is agnostic to cell type or developmental stage. It can be applied to investigate any heritable disease or trait and is complementary to resources using cell type– and time point–specific functional assays like Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTEx).

    RESULTS

    Using constraint calculated across placental mammals, 3.3% of bases in the human genome are significantly constrained, including 57.6% of coding bases. Most constrained bases (80.7%) are noncoding. Common variants (allele frequency ≥ 5%) and low-frequency variants (0.5% ≤ allele frequency < 5%) are depleted for constrained bases (1.85 versus 3.26% expected by chance, P < 2.2 × 10⁻³⁰⁸). Pathogenic ClinVar variants are more constrained than benign variants (P < 2.2 × 10⁻¹⁶). The most constrained common variants are more enriched for disease single-nucleotide polymorphism (SNP)–heritability in 63 independent GWASs. The enrichment of SNP-heritability in constrained regions is greater (7.8-fold) than previously reported in mammals and is even higher in primates (11.1-fold). It exceeds the enrichment of SNP-heritability in nonsynonymous coding variants (7.2-fold) and fine-mapped expression quantitative trait loci (eQTL)–SNPs (4.8-fold). The enrichment peaks near constrained bases, with a log-linear decrease of SNP-heritability enrichment as a function of the distance to a constrained base. Zoonomia constraint scores improve functionally informed fine-mapping. Variants at sites constrained in mammals and primates have greater posterior inclusion probabilities and higher per-SNP contributions. In addition, using both constraint and functional annotations improves polygenic risk score accuracy across a range of traits. Finally, incorporating constraint information into the analysis of noncoding somatic variants in medulloblastomas identifies new candidate driver genes.

    CONCLUSION

    Genome-wide measures of evolutionary constraint can help discern which variants are functionally important. This information may accelerate the translation of genomic discoveries into the biological, clinical, and therapeutic knowledge that is required to understand and treat human disease.

    Figure: Using evolutionary constraint in genomic studies of human diseases. (A) Constraint was calculated across 240 mammal species, including 43 primates (teal line). (B) Pathogenic ClinVar variants (N = 73,885) are more constrained across mammals than benign variants (N = 231,642; P < 2.2 × 10⁻¹⁶). (C) More-constrained bases are more enriched for trait-associated variants (63 GWASs). (D) Enrichment of heritability is higher in constrained regions than in functional annotations (left), even in a joint model with 106 annotations (right). (E) Fine-mapping (PolyFun) using a model that includes constraint scores identifies an experimentally validated association at rs1421085. Error bars represent 95% confidence intervals. BMI, body mass index; LF, low frequency; PIP, posterior inclusion probability.
  3. Abstract

    Genomically imprinted loci are expressed mono-allelically, dependent upon the parent of origin. Their regulation not only illuminates how chromatin regulates gene expression but also how chromatin can be reprogrammed every generation. Because of their distinct parent-of-origin regulation, analysis of imprinted loci can be difficult. Single nucleotide polymorphisms (SNPs) are required to assess these elements accurately in an allele-specific manner. However, publicly available SNP databases lack robust verification, making analysis of imprinting difficult. In addition, the allele-specific imprinting assays that have been developed employ different mouse strains, making it difficult to analyze these loci systematically. Here, we have generated a resource that will allow the allele-specific analysis of many significant imprinted loci in a single hybrid strain of Mus musculus. This resource includes verification of SNPs present within 10 of the most widely used imprinting control regions and allele-specific DNA methylation assays for each gene in a C57BL/6J and CAST/EiJ hybrid strain background.

     
  4. Wittkopp, Patricia (Ed.)
    Investigating closely related species that rapidly evolved divergent feeding morphology is a powerful approach to identify genetic variation underlying variation in complex traits. This can also lead to the discovery of novel candidate genes influencing natural and clinical variation in human craniofacial phenotypes. We combined whole-genome resequencing of 258 individuals with 50 transcriptomes to identify candidate cis-acting genetic variation underlying rapidly evolving craniofacial phenotypes within an adaptive radiation of Cyprinodon pupfishes. This radiation consists of a dietary generalist species and two derived trophic niche specialists: a molluscivore and a scale-eating species. Despite extensive morphological divergence, these species only diverged 10 kya and produce fertile hybrids in the laboratory. Out of 9.3 million genome-wide SNPs and 80,012 structural variants, we found very few alleles fixed between species: only 157 SNPs and 87 deletions. Comparing gene expression across 38 purebred F1 offspring sampled at three early developmental stages, we identified 17 fixed variants within 10 kb of 12 genes that were highly differentially expressed between species. By measuring allele-specific expression in F1 hybrids from multiple crosses, we found that the majority of expression divergence between species was explained by trans-regulatory mechanisms. We also found strong evidence for two cis-regulatory alleles affecting expression divergence of two genes with putative effects on skeletal development (dync2li1 and pycr3). These results suggest that SNPs and structural variants contribute to the evolution of novel traits and highlight the utility of the San Salvador Island pupfish system as an evolutionary model for craniofacial development.
  5. Abstract

    Background

    Crop improvement through cross-population genomic prediction and genome editing requires identification of causal variants at high resolution, within fewer than hundreds of base pairs. Most genetic mapping studies have generally lacked such resolution. In contrast, evolutionary approaches can detect genetic effects at high resolution, but they are limited by shifting selection, missing data, and low depth of multiple-sequence alignments. Here we use genomic annotations to accurately predict nucleotide conservation across angiosperms, as a proxy for fitness effect of mutations.

    Results

    Using only sequence analysis, we annotate nonsynonymous mutations in 25,824 maize gene models, with information from bioinformatics and deep learning. Our predictions are validated by experimental information: within-species conservation, chromatin accessibility, and gene expression. According to gene ontology and pathway enrichment analyses, predicted nucleotide conservation points to genes in central carbon metabolism. Importantly, it improves genomic prediction for fitness-related traits such as grain yield, in elite maize panels, by stringent prioritization of fewer than 1% of single-site variants.

    Conclusions

    Our results suggest that predicting nucleotide conservation across angiosperms may effectively prioritize sites most likely to impact fitness-related traits in crops, without being limited by shifting selection, missing data, and low depth of multiple-sequence alignments. Our approach—Prediction of mutation Impact by Calibrated Nucleotide Conservation (PICNC)—could be useful to select polymorphisms for accurate genomic prediction, and candidate mutations for efficient base editing. The trained PICNC models and predicted nucleotide conservation at protein-coding SNPs in maize are publicly available in CyVerse (https://doi.org/10.25739/hybz-2957).

     