skip to main content

Title: DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM
Abstract Background Cryo-electron microscopy (Cryo-EM) is widely used in the determination of the three-dimensional (3D) structures of macromolecules. Particle picking from 2D micrographs remains a challenging early step in the Cryo-EM pipeline due to the diversity of particle shapes and the extremely low signal-to-noise ratio of micrographs. Because of these issues, significant human intervention is often required to generate a high-quality set of particles for input to the downstream structure determination steps. Results Here we propose a fully automated approach (DeepCryoPicker) for single particle picking based on deep learning. It first uses automated unsupervised learning to generate particle training datasets. Then it trains a deep neural network to classify particles automatically. Results indicate that the DeepCryoPicker compares favorably with semi-automated methods such as DeepEM, DeepPicker, and RELION, with the significant advantage of not requiring human intervention. Conclusions Our framework combing supervised deep learning classification with automated un-supervised clustering for generating training data provides an effective approach to pick particles in cryo-EM images automatically and accurately.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
BMC Bioinformatics
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract Background Cryo-EM data generated by electron tomography (ET) contains images for individual protein particles in different orientations and tilted angles. Individual cryo-EM particles can be aligned to reconstruct a 3D density map of a protein structure. However, low contrast and high noise in particle images make it challenging to build 3D density maps at intermediate to high resolution (1–3 Å). To overcome this problem, we propose a fully automated cryo-EM 3D density map reconstruction approach based on deep learning particle picking. Results A perfect 2D particle mask is fully automatically generated for every single particle. Then, it uses a computer vision image alignment algorithm (image registration) to fully automatically align the particle masks. It calculates the difference of the particle image orientation angles to align the original particle image. Finally, it reconstructs a localized 3D density map between every two single-particle images that have the largest number of corresponding features. The localized 3D density maps are then averaged to reconstruct a final 3D density map. The constructed 3D density map results illustrate the potential to determine the structures of the molecules using a few samples of good particles. Also, using the localized particle samples (with no background) to generate the localized 3D density maps can improve the process of the resolution evaluation in experimental maps of cryo-EM. Tested on two widely used datasets, Auto3DCryoMap is able to reconstruct good 3D density maps using only a few thousand protein particle images, which is much smaller than hundreds of thousands of particles required by the existing methods. Conclusions We design a fully automated approach for cryo-EM 3D density maps reconstruction (Auto3DCryoMap). Instead of increasing the signal-to-noise ratio by using 2D class averaging, our approach uses 2D particle masks to produce locally aligned particle images. Auto3DCryoMap is able to accurately align structural particle shapes. Also, it is able to construct a decent 3D density map from only a few thousand aligned particle images while the existing tools require hundreds of thousands of particle images. Finally, by using the pre-processed particle images,Auto3DCryoMap reconstructs a better 3D density map than using the original particle images. 
    more » « less
  2. CryoDRGN is a machine learning system for heterogenous cryo-EM reconstruction of proteins and protein complexes from single particle cryo-EM data. Central to this approach is a deep generative model for heterogeneous cryo-EM density maps, which we empirically find effectively models both discrete and continuous forms of structural variability. Once trained, cryoDRGN is capable of generating an arbitrary number of 3D density maps, and thus interpreting the resulting ensemble is a challenge. Here, we showcase interactive and automated processing approaches for analyzing cryoDRGN results. Specifically, we detail a step-by-step protocol for analysis of the assembling 50S ribosome dataset (Davis et al., EMPIAR-10076), including preparation of inputs, network training, and visualization of the resulting ensemble of density maps. Additionally, we describe and implement methods to comprehensively analyze and interpret the distribution of volumes with the assistance of an associated atomic model. This protocol is appropriate for structural biologists familiar with processing single particle cryo-EM datasets and with moderate experience navigating Python and Jupyter notebooks. It requires 3-4 days to complete. 
    more » « less
  3. Abstract

    Cryo‐electron microscopy (cryo‐EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo‐EM has been drastically improved to generate high‐resolution three‐dimensional maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo‐EM model building approach is template‐based homology modeling. Manual de novo modeling is very time‐consuming when no template model is found in the database. In recent years, de novo cryo‐EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top‐performing methods in macromolecular structure modeling. DL‐based de novo cryo‐EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL‐based de novo cryo‐EM modeling methods. Their significances are discussed from both practical and methodological viewpoints. We also briefly describe the background of cryo‐EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence for de novo molecular structure modeling and future directions in this emerging field.

    This article is categorized under:

    Structure and Mechanism > Molecular Structures

    Structure and Mechanism > Computational Biochemistry and Biophysics

    Data Science > Artificial Intelligence/Machine Learning

    more » « less
  4. null (Ed.)
    Information about macromolecular structure of protein complexes and related cellular and molecular mechanisms can assist the search for vaccines and drug development processes. To obtain such structural information, we present DeepTracer, a fully automated deep learning-based method for fast de novo multichain protein complex structure determination from high-resolution cryoelectron microscopy (cryo-EM) maps. We applied DeepTracer on a previously published set of 476 raw experimental cryo-EM maps and compared the results with a current state of the art method. The residue coverage increased by over 30% using DeepTracer, and the rmsd value improved from 1.29 Å to 1.18 Å. Additionally, we applied DeepTracer on a set of 62 coronavirus-related cryo-EM maps, among them 10 with no deposited structure available in EMDataResource. We observed an average residue match of 84% with the deposited structures and an average rmsd of 0.93 Å. Additional tests with related methods further exemplify DeepTracer’s competitive accuracy and efficiency of structure modeling. DeepTracer allows for exceptionally fast computations, making it possible to trace around 60,000 residues in 350 chains within only 2 h. The web service is globally accessible at . 
    more » « less
  5. A method is proposed to reconstruct the 3D molecular structure from micrographs collected at just one sample tilt angle in the random conical tilt scheme in cryo-electron microscopy. The method uses autocorrelation analysis on the micrographs to estimate features of the molecule which are invariant under certain nuisance parameters such as the positions of molecular projections in the micrographs. This enables the molecular structure to be reconstructed directly from micrographs, completely circumventing the need for particle picking. Reconstructions are demonstrated with simulated data and the effect of the missing-cone region is investigated. These results show promise to reduce the size limit for single-particle reconstruction in cryo-electron microscopy. 
    more » « less