Abstract In oriented‐sample (OS) solid‐state NMR of membrane proteins, the angular‐dependent dipolar couplings and chemical shifts provide a direct input for structure calculations. However, so far only 1H–15N dipolar couplings and 15N chemical shifts have been routinely assessed in oriented 15N‐labeled samples. The main obstacle for extending this technique to membrane proteins of arbitrary topology has remained in the lack of additional experimental restraints. We have developed a new experimental triple‐resonance NMR technique, which was applied to uniformly doubly (15N,13C)‐labeled Pf1 coat protein in magnetically aligned DMPC/DHPC bicelles. The previously inaccessible 1Hα–13Cα dipolar couplings have been measured, which make it possible to determine the torsion angles between the peptide planes without assuming α‐helical structure a priori. The fitting of three angular restraints per peptide plane and filtering by Rosetta scoring functions has yielded a consensus α‐helical transmembrane structure for Pf1 protein.
Validated determination of NRG1 Ig-like domain structure by mass spectrometry coupled with computational modeling
Abstract High resolution hydroxyl radical protein footprinting (HR-HRPF) is a mass spectrometry-based method that measures the solvent exposure of multiple amino acids in a single experiment, offering constraints for experimentally informed computational modeling. HR-HRPF-based modeling has previously been used to accurately model the structure of proteins of known structure, but the technique has never been used to determine the structure of a protein of unknown structure. Here, we present the use of HR-HRPF-based modeling to determine the structure of the Ig-like domain of NRG1, a protein with no close homolog of known structure. Independent determination of the protein structure by both HR-HRPF-based modeling and heteronuclear NMR was carried out, with results compared only after both processes were complete. The HR-HRPF-based model was highly similar to the lowest energy NMR model, with a backbone RMSD of 1.6 Å. To our knowledge, this is the first use of HR-HRPF-based modeling to determine a previously uncharacterized protein structure.
- Award ID(s): 1750666
- PAR ID: 10367072
- Publisher / Repository: Nature Publishing Group
- Date Published:
- Journal Name: Communications Biology
- Volume: 5
- Issue: 1
- ISSN: 2399-3642
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The N‐terminal half of the giant cytoskeletal protein obscurin is comprised of more than 50 Ig‐like domains, arranged in tandem. Domains 18–51 are connected to each other through short 5‐residue linkers, and this arrangement has been previously shown to form a semi‐flexible rod in solution. Domains 1–18 generally have slightly longer ~7 residue interdomain linkers, and the multidomain structure and motion conferred by this kind of linker is understudied. Here, we use NMR, SAXS, and MD to show that these longer linkers are associated with significantly more domain/domain flexibility, with the resulting multidomain structure being moderately compact. Further examination of the relationship between interdomain flexibility and linker length shows there is a 5 residue “sweet spot” linker length that results in dual‐domain systems being extended, and conversely that both longer and shorter linkers result in a less extended structure. This detailed knowledge of the obscurin N terminus structure and flexibility allowed for mathematical modeling of domains 1–18, which suggests that this region likely forms tangles if left alone in solution. Given how infrequently protein tangles occur in nature, and given the pathological outcomes that occur when tangles do arise, our data suggest that obscurin is likely either significantly scaffolded or else externally extended in the cell.
-
Abstract AlphaFold2 has revolutionized protein structure prediction from amino‐acid sequence. In addition to protein structures, high‐resolution dynamics information about various protein regions is important for understanding protein function. Although AlphaFold2 has neither been designed nor trained to predict protein dynamics, it is shown here how the information returned by AlphaFold2 can be used to predict dynamic protein regions at the individual residue level. The approach, which is termed cdsAF2, uses the 3D protein structure returned by AlphaFold2 to predict backbone NMR NH S² order parameters using a local contact model that takes into account the contacts made by each peptide plane along the backbone with its environment. By combining for each residue AlphaFold2's pLDDT confidence score for the structure prediction accuracy with the predicted S² value using the local contact model, an estimator is obtained that semi‐quantitatively captures many of the dynamics features observed in experimental backbone NMR NH S² order parameter profiles. The method is demonstrated for a set of nine proteins of different sizes and variable amounts of dynamics and disorder.
-
Abstract Cryo‐electron microscopy (cryo‐EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo‐EM has been drastically improved to generate high‐resolution three‐dimensional maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo‐EM model building approach is template‐based homology modeling. Manual de novo modeling is very time‐consuming when no template model is found in the database. In recent years, de novo cryo‐EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top‐performing methods in macromolecular structure modeling. DL‐based de novo cryo‐EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL‐based de novo cryo‐EM modeling methods. Their significance is discussed from both practical and methodological viewpoints. We also briefly describe the background of the cryo‐EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence for de novo molecular structure modeling and future directions in this emerging field. This article is categorized under: Structure and Mechanism > Molecular Structures; Structure and Mechanism > Computational Biochemistry and Biophysics; Data Science > Artificial Intelligence/Machine Learning
-
Abstract Protein structure prediction is an important problem in bioinformatics and has been studied for decades. However, there are still few open-source comprehensive protein structure prediction packages publicly available in the field. In this paper, we present our latest open-source protein tertiary structure prediction system—MULTICOM2, an integration of template-based modeling (TBM) and template-free modeling (FM) methods. The template-based modeling uses sequence alignment tools with deep multiple sequence alignments to search for structural templates, which are much faster and more accurate than MULTICOM1. The template-free (ab initio or de novo) modeling uses the inter-residue distances predicted by DeepDist to reconstruct tertiary structure models without using any known structure as template. In the blind CASP14 experiment, the average TM-score of the models predicted by our server predictor based on the MULTICOM2 system is 0.720 for 58 TBM (regular) domains and 0.514 for 38 FM and FM/TBM (hard) domains, indicating that MULTICOM2 is capable of predicting good tertiary structures across the board. It can predict the correct fold for 76 CASP14 domains (95% regular domains and 55% hard domains) if only one prediction is made for a domain. The success rate is increased by 3% for both regular and hard domains if five predictions are made per domain. Moreover, the prediction accuracy of the pure template-free structure modeling method on both TBM and FM targets is very close to the combination of template-based and template-free modeling methods. This demonstrates that the distance-based template-free modeling method powered by deep learning can largely replace the traditional template-based modeling method even on TBM targets that TBM methods used to dominate and therefore provides a uniform structure modeling approach to any protein.
Finally, on the 38 CASP14 FM and FM/TBM hard domains, MULTICOM2 server predictors (MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-DIST) were ranked among the top 20 automated server predictors in the CASP14 experiment. After combining multiple predictors from the same research group as one entry, MULTICOM-HYBRID was ranked no. 5. The source code of MULTICOM2 is freely available at https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.
