Abstract Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Data Bank. We noticed that the model quality was higher and the root mean square deviation (RMSD) between AlphaFold and RoseTTAFold models was lower for domains that could be assigned to CATH families than for those that could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein–protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, predicted as destabilizing and pathogenic. Using models from the two state-of-the-art techniques provides better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.
Prediction of the Effects of Missense Mutations on Human Myeloperoxidase Protein Stability Using In Silico Saturation Mutagenesis
Myeloperoxidase (MPO) is a heme peroxidase with microbicidal properties. MPO plays a role in the host’s innate immunity by producing reactive oxygen species inside the cell against foreign organisms. However, there is little functional evidence linking MPO missense mutations to human diseases. We utilized in silico saturation mutagenesis to generate and analyze the effects of 10,811 potential missense mutations on MPO stability. Our results showed that ~71% of the potential missense mutations destabilize MPO, and ~8% stabilize the MPO protein. We showed that G402W, G402Y, G361W, G402F, and G655Y would have the highest destabilizing effect on MPO. Meanwhile, D264L, G501M, D264H, D264M, and G501L have the highest stabilizing effect on the MPO protein. Our computational predictions showed destabilizing effects for 13 of the 14 MPO missense mutations that cause diseases in humans. We also analyzed putative post-translational modification (PTM) sites on the MPO protein and mapped the PTM sites to disease-associated missense mutations for further analysis. Our analysis showed that R327H, associated with frontotemporal dementia, and R548W, which causes generalized pustular psoriasis, are near these PTM sites. Our results will aid further research into MPO as a biomarker for human complex diseases and a candidate for drug target discovery.
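As a rough illustration of what in silico saturation mutagenesis enumerates, the sketch below lists every single-residue substitution for a sequence and sorts predicted folding free energy changes (ddG) into classes. It is not the authors' pipeline: the function names, the 0.5 kcal/mol cutoff, and the sign convention (positive ddG means destabilizing) are assumptions made for illustration, and the ddG values themselves would have to come from a separate structure-based energy tool.

```python
# Hypothetical sketch: enumerate all missense substitutions for a protein
# sequence and classify externally computed ddG values.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def saturation_mutations(sequence, start=1):
    """Yield every single-residue substitution in 'G402W' style notation.

    `start` is the residue number of the first character in `sequence`.
    """
    for offset, wild_type in enumerate(sequence):
        position = start + offset
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield f"{wild_type}{position}{mutant}"


def classify_ddg(ddg, threshold=0.5):
    """Label a ddG value (kcal/mol).

    Assumed convention: positive ddG is destabilizing. The threshold is an
    illustrative choice, not a value taken from the study.
    """
    if ddg > threshold:
        return "destabilizing"
    if ddg < -threshold:
        return "stabilizing"
    return "neutral"


# Toy three-residue sequence: 3 positions x 19 alternatives = 57 mutations.
toy_mutations = list(saturation_mutations("MGD"))
print(len(toy_mutations), toy_mutations[:3])
```

Each position contributes 19 substitutions, so a total of 10,811 mutations is consistent with 569 modeled residues (569 × 19 = 10,811).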
- PAR ID: 10358834
- Date Published:
- Journal Name: Genes
- Volume: 13
- Issue: 8
- ISSN: 2073-4425
- Page Range / eLocation ID: 1412
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Background/Objectives: Somatic and genetic mutations in glutathione peroxidases (GPxs), including GPx7 and GPx8, have been linked to intellectual disability, microcephaly, and various tumors. GPx7 and GPx8 evolved the latest among the GPx enzymes and are present in the endoplasmic reticulum. Although lacking a glutathione binding domain, GPx7 and GPx8 possess peroxidase activity that helps the body respond to cellular stress. However, the protein mutations in these peroxidases remain relatively understudied. Methods: By elucidating the structural and stability consequences of missense mutations, this study aims to provide insights into the pathogenic mechanisms involved in different cancers, thereby aiding clinical diagnosis, treatment strategies, and the development of targeted therapies. We performed saturated computational mutagenesis to analyze 2926 and 3971 missense mutations of GPx7 and GPx8, respectively. Results: The results indicate that G153H and G153F in GPx7 are highly destabilizing, while E93M and W142F are stabilizing. In GPx8, N74W and G173W caused the most instability, while S70I and S119P increased stability. Our analysis shows that highly destabilizing somatic and genetic mutations are more likely to be pathogenic than stabilizing mutations. Conclusions: This comprehensive analysis of missense mutations in GPx7 and GPx8 provides critical insights into their impact on protein structure and stability, contributing to a deeper understanding of the roles of somatic mutations in cancer development and progression. These findings can inform more precise clinical diagnostics and targeted treatment approaches for cancers.
-
Wallqvist, Anders (Ed.) Many pathogenic missense mutations are found in protein positions that are neither well-conserved nor fall in any known functional domains. Consequently, we lack any mechanistic underpinning of dysfunction caused by such mutations. We explored the disruption of allosteric dynamic coupling between these positions and the known functional sites as a possible mechanism for pathogenesis. In this study, we present an analysis of 591 pathogenic missense variants in 144 human enzymes that suggests that allosteric dynamic coupling of mutated positions with known active sites is a plausible biophysical mechanism and evidence of their functional importance. We illustrate this mechanism in a case study of β-Glucocerebrosidase (GCase) in which a vast majority of 94 sites harboring Gaucher disease-associated missense variants are located some distance away from the active site. An analysis of the conformational dynamics of GCase suggests that mutations on these distal sites cause changes in the flexibility of active site residues despite their distance, indicating a dynamic communication network throughout the protein. The disruption of the long-distance dynamic coupling caused by missense mutations may provide a plausible general mechanistic explanation for biological dysfunction and disease.
-
Abstract The spike (S) glycoprotein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for binding to permissive cells. The receptor-binding domain (RBD) of SARS-CoV-2 S protein directly interacts with the human angiotensin-converting enzyme 2 (ACE2) on the host cell membrane. In this study, we used computational saturation mutagenesis approaches, including structure-based energy calculations and sequence-based pathogenicity predictions, to quantify the systemic effects of missense mutations on SARS-CoV-2 S protein structure and function. A total of 18 354 mutations in S protein were analyzed, and we discovered that most of these mutations could destabilize the entire S protein and its RBD. Specifically, residues G431 and S514 in SARS-CoV-2 RBD are important for S protein stability. We analyzed 384 experimentally verified S missense variations and revealed that the dominant pandemic form, D614G, can stabilize the entire S protein. Moreover, many mutations in N-linked glycosylation sites can increase the stability of the S protein. In addition, we investigated 3705 mutations in SARS-CoV-2 RBD and 11 324 mutations in human ACE2 and found that SARS-CoV-2 neighbor residues G496 and F497 and ACE2 residues D355 and Y41 are critical for the RBD–ACE2 interaction. The findings comprehensively provide potential target sites in the development of drugs and vaccines against COVID-19.
-
Severe acute respiratory syndrome coronavirus (SARS-CoV-1) attaches to the host cell surface to initiate the interaction between the receptor-binding domain (RBD) of its spike glycoprotein (S) and the human angiotensin-converting enzyme 2 (hACE2) receptor. SARS-CoV-1 mutates frequently because of its RNA genome, which challenges antiviral development. Here, we performed computational saturation mutagenesis of the S protein of SARS-CoV-1 to identify the residues crucial for its functions. We used structure-based energy calculations to analyze the effects of the missense mutations on SARS-CoV-1 S stability and on the binding affinity with hACE2. The sequence and structure alignment showed similarities between the S proteins of SARS-CoV-1 and SARS-CoV-2. Interestingly, we found that target mutations of S protein amino acids generate similar effects on stability in SARS-CoV-1 and SARS-CoV-2. For example, G839W of SARS-CoV-1 corresponds to G857W of SARS-CoV-2, and both decrease the stability of their S glycoproteins. The viral mutation analysis of the two different SARS-CoV-1 isolates showed that the mutations T487S and L472P weakened the S-hACE2 binding of the 2003–2004 SARS-CoV-1 isolate. In addition, the mutations L472P and F360S destabilized the 2003–2004 viral isolate. We further predicted that many mutations on N-linked glycosylation sites would increase the stability of the S glycoprotein. Our results can be of therapeutic importance in the design of antivirals or vaccines against SARS-CoV-1 and SARS-CoV-2.

