NSF PAR Search | NSF Public Access Repository


DEMO-EM2: assembling protein complex structures from cryo-EM maps through intertwined chain and domain fitting

https://doi.org/10.1093/bib/bbae113

Zhang, Ziying; Cai, Yaxian; Zhang, Biao; Zheng, Wei; Freddolino, Lydia; Zhang, Guijun; Zhou, Xiaogen (March 2024, Briefings in Bioinformatics)

Abstract: The breakthrough in cryo-electron microscopy (cryo-EM) technology has led to an increasing number of density maps of biological macromolecules. However, constructing accurate protein complex atomic structures from cryo-EM maps remains a challenge. In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure modeling problems.
LOMETS3: integrating deep learning and profile alignment for advanced protein template recognition and function annotation

https://doi.org/10.1093/nar/gkac248

Zheng, Wei; Wuyun, Qiqige; Zhou, Xiaogen; Li, Yang; Freddolino, Lydia; Zhang, Yang (April 2022, Nucleic Acids Research)

Abstract: Deep learning techniques have significantly advanced the field of protein structure prediction. LOMETS3 (https://zhanglab.ccmb.med.umich.edu/LOMETS/) is a new generation meta-server approach to template-based protein structure prediction and function annotation, which integrates newly developed deep learning threading methods. For the first time, we have extended LOMETS3 to handle multi-domain proteins and to construct full-length models with gradient-based optimizations. Starting from a FASTA-formatted sequence, LOMETS3 performs four steps of domain boundary prediction, domain-level template identification, full-length template/model assembly and structure-based function prediction. The output of LOMETS3 contains (i) top-ranked templates from LOMETS3 and its component threading programs, (ii) up to 5 full-length structure models constructed by L-BFGS (limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm) optimization, (iii) the 10 closest Protein Data Bank (PDB) structures to the target, (iv) structure-based functional predictions, (v) domain partition and assembly results, and (vi) the domain-level threading results, including items (i)–(iii) for each identified domain. LOMETS3 was tested in large-scale benchmarks and the blind CASP14 (14th Critical Assessment of Structure Prediction) experiment, where its overall template recognition and function prediction accuracy significantly exceeded that of its predecessors and other state-of-the-art threading approaches, especially for hard targets without homologous templates in the PDB. With these improvements, LOMETS3 should help significantly advance the capability of the broader biomedical community for template-based protein structure and function modelling.
Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks

https://doi.org/10.1371/journal.pcbi.1008865

Li, Yang; Zhang, Chengxin; Bell, Eric W.; Zheng, Wei; Zhou, Xiaogen; Yu, Dong-Jun; Zhang, Yang (March 2021, PLOS Computational Biology)
Kolodny, Rachel (Ed.)
The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structure of proteins that lack homologous templates in the PDB library.
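The top-L and top-L/5 long-range contact precision quoted in this abstract can be illustrated with a minimal sketch. It is not taken from the TripletRes code: the function name `top_l_precision`, its parameters, and the default sequence-separation cutoff of 24 residues for "long-range" are assumptions (the abstract does not state the cutoff; 24 follows the usual CASP convention).

```python
import numpy as np

def top_l_precision(pred_prob, true_contact, min_sep=24, k=1):
    """Precision of the top L/k predicted contacts among residue pairs
    whose sequence separation is at least min_sep (assumed cutoff)."""
    pred_prob = np.asarray(pred_prob, dtype=float)
    true_contact = np.asarray(true_contact, dtype=bool)
    L = pred_prob.shape[0]

    # Upper-triangle pairs (i, j) with j - i >= min_sep.
    i, j = np.triu_indices(L, k=min_sep)

    # Keep L // k pairs, but never more than the pairs available.
    n_top = min(max(1, L // k), len(i))
    if n_top == 0:
        return 0.0

    # Rank candidate pairs by predicted probability, highest first.
    order = np.argsort(-pred_prob[i, j], kind="stable")[:n_top]

    # Fraction of the selected pairs that are true contacts.
    return float(true_contact[i[order], j[order]].mean())
```

Calling it with `k=1` gives the top-L precision and with `k=5` the top-L/5 precision, provided both matrices are L x L and indexed by residue position.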
Protein Structure and Sequence Reanalysis of 2019-nCoV Genome Refutes Snakes as Its Intermediate Host and the Unique Similarity between Its Spike Protein Insertions and HIV-1

https://doi.org/10.1021/acs.jproteome.0c00129

Zhang, Chengxin; Zheng, Wei; Huang, Xiaoqiang; Bell, Eric W.; Zhou, Xiaogen; Zhang, Yang (April 2020, Journal of Proteome Research)

FUpred: detecting protein domains through deep-learning-based contact map prediction

https://doi.org/10.1093/bioinformatics/btaa217

Zheng, Wei; Zhou, Xiaogen; Wuyun, Qiqige; Pearce, Robin; Li, Yang; Zhang, Yang; Elofsson, Arne (March 2020, Bioinformatics)

Abstract: Motivation: Protein domains are subunits that can fold and function independently. Correct domain boundary assignment is thus a critical step toward accurate protein structure and function analyses. There is, however, no efficient algorithm available for accurate domain prediction from sequence. The problem is particularly challenging for proteins with discontinuous domains, which consist of domain segments that are separated along the sequence. Results: We developed a new algorithm, FUpred, which predicts protein domain boundaries utilizing contact maps created by deep residual neural networks coupled with coevolutionary precision matrices. The core idea of the algorithm is to retrieve domain boundary locations by maximizing the number of intra-domain contacts, while minimizing the number of inter-domain contacts from the contact maps. FUpred was tested on a large-scale dataset consisting of 2549 proteins and generated correct single- and multi-domain classifications with a Matthews correlation coefficient of 0.799, which was 19.1% (or 5.3%) higher than the best machine learning (or threading)-based method. For proteins with discontinuous domains, the domain boundary detection and normalized domain overlapping scores of FUpred were 0.788 and 0.521, respectively, which were 17.3% and 23.8% higher than the best control method. The results demonstrate a new avenue to accurately detect domain composition from sequence alone, especially for discontinuous, multi-domain proteins. Availability and implementation: https://zhanglab.ccmb.med.umich.edu/FUpred. Supplementary information: Supplementary data are available at Bioinformatics online.
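The core idea stated in this abstract, choosing a boundary that keeps contacts inside domains and few contacts between them, can be sketched for the simplest case of one cut into two continuous domains. This is an illustrative stand-in, not FUpred's actual scoring function: the name `best_two_domain_cut`, the `min_len` parameter, and the inter-domain contact density used as the score are all assumptions.

```python
import numpy as np

def best_two_domain_cut(contacts, min_len=2):
    """Scan every cut position c (residues [0, c) vs [c, L)) and return the
    cut with the lowest inter-domain contact density.

    The density score is a hypothetical simplification, not FUpred's score.
    """
    contacts = np.asarray(contacts, dtype=bool)
    L = contacts.shape[0]
    best_cut, best_score = None, float("inf")

    for c in range(min_len, L - min_len + 1):
        # Contacts that cross the cut: rows in domain 1, columns in domain 2.
        inter = contacts[:c, c:].sum()
        # Normalise by the number of possible cross-domain pairs so that
        # trivially small domains near the termini are not favoured.
        density = inter / (c * (L - c))
        if density < best_score:
            best_cut, best_score = c, density

    return best_cut, best_score
```

A binary L x L contact map goes in, and the returned cut index marks the first residue of the second domain. Discontinuous domains, which the abstract highlights, would need a search over more than one cut point and are not covered by this sketch.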
Research in Computational Molecular Biology 27th Annual International Conference, RECOMB 2023, Istanbul, Turkey, April 16–19, 2023, Proceedings

Luo, Runpeng; Lin, Yu; Fan, Jason; Khan, Jamshed; Pibiri, Giulio Ermanno; Patro, Rob; Tabatabaee, Yasamin; Roch, Sébastien; Warnow, Tandy; Chandra, Ghanshyam; et al (April 2023, Springer Cham)
Tang, Haixu (Ed.)
This book constitutes the refereed proceedings of the 27th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2023, held in Istanbul, Turkey, from April 16–19, 2023. The 11 regular and 33 short papers presented in this book were carefully reviewed and selected from 188 submissions. The papers report on original research in all areas of computational molecular biology and bioinformatics.
