Abstract New drug production, from target identification to marketing approval, takes over 12 years and can cost around $2.6 billion. Furthermore, the COVID-19 pandemic has unveiled the urgent need for more powerful computational methods for drug discovery. Here, we review the computational approaches to predicting protein–ligand interactions in the context of drug discovery, focusing on methods using artificial intelligence (AI). We begin with a brief introduction to proteins (targets), ligands (e.g. drugs) and their interactions for nonexperts. Next, we review databases that are commonly used in the domain of protein–ligand interactions. Finally, we survey and analyze the machine learning (ML) approaches implemented to predict protein–ligand binding sites, ligand-binding affinity and binding pose (conformation) including both classical ML algorithms and recent deep learning methods. After exploring the correlation between these three aspects of protein–ligand interaction, it has been proposed that they should be studied in unison. We anticipate that our review will aid exploration and development of more accurate ML-based prediction strategies for studying protein–ligand interactions. 
                        more » 
                        « less   
                    
                            
                            Machine Learning Methods in Drug Discovery
                        
                    
    
            The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high through-put computational analysis of databases used for both lead and target discovery, has increased the reliability of the machine learning and deep learning incorporated techniques. The use of these virtual screening and encompassing online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications that produce promising results and methods will be reviewed. 
        more » 
        « less   
        
    
    
                            - PAR ID:
- 10275980
- Date Published:
- Journal Name:
- Molecules
- Volume:
- 25
- Issue:
- 22
- ISSN:
- 1420-3049
- Page Range / eLocation ID:
- 5277
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            Abstract Small molecules have been playing a crucial role in drug discovery; however, some exhibit nonspecific inhibitory effects during hit screening due to the formation of colloidal aggregators. Such false positives often lead to significant research costs and time investment. Therefore, to identify potential aggregating compounds efficiently and accurately at an early stage of drug discovery, we employed several machine learning techniques to develop classification models for identifying promiscuous aggregating inhibitors. Using a training dataset of 10 000 aggregators and 10 000 nonaggregators, models were trained by combining four different molecular representations with various machine learning algorithms. We found that the best-performing model is the one that employs path-based FP2 fingerprints in conjunction with the cubic support vector machine algorithm, which achieved the highest accuracy and area under the receiver operating characteristic curve values for both the validation and test datasets while maintaining high sensitivity and specificity levels (>0.93). Additionally, we have proposed a new model interpretation method, global sensitivity analysis (GSA), to complement the well-recognized SHapley Additive exPlanations analysis. Several comparative studies have shown that GSA is a time-efficient and accurate approach for identifying crucial descriptors that contribute to model prediction, especially in the scenario where the dataset contains a substantial number of data entries with a limited set of descriptors. Our models as well as GSA findings can provide useful guidance on screening library design to minimize false positives.more » « less
- 
            Recently, molecular fingerprints extracted from three-dimensional (3D) structures using advanced mathematics, such as algebraic topology, differential geometry, and graph theory have been paired with efficient machine learning, especially deep learning algorithms to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets associated with four typical problems, namely protein–ligand binding, toxicity, solubility and partition coefficient to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms including random forest, gradient boosted decision tree, single-task deep neural network and multitask deep neural network are employed to construct efficient 2D-fingerprint based models. Additionally, appropriate consensus models are built to further enhance the performance of 2D-fingerprint-based methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, solubility, partition coefficient and protein–ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D fingerprint-based methods in complex-based protein–ligand binding affinity predictions.more » « less
- 
            Abstract A machine learning-based drug screening technique has been developed and optimized using convolutional neural network-derived fingerprints. The optimization of weights in the neural network-based fingerprinting technique was compared with fixed Morgan fingerprints in regard to binary classification on drug-target binding affinity. The assessment was carried out using six different target proteins using randomly chosen small molecules from the ZINC15 database for training. This new architecture proved to be more efficient in screening molecules that less favorably bind to specific targets and retaining molecules that favorably bind to it. Scientific contribution We have developed a new neural fingerprint-based screening model that has a significant ability to capture hits. Despite using a smaller dataset, this model is capable of mapping chemical space similar to other contemporary algorithms designed for molecular screening. The novelty of the present algorithm lies in the speed with which the models are trained and tuned before testing its predictive capabilities and hence is a significant step forward in the field of machine learning-embedded computational drug discovery.more » « less
- 
            With recent advances in microfabrication technologies, the miniaturization of traditional culturing techniques has provided ideal methods for interrogating microbial communities in a confined and finely controlled environment. Micro-technologies offer high-throughput screening and analysis, reduced experimental time and resources, and have low footprint. More importantly, they provide access to culturing microbes in situ in their natural environments and similarly, offer optical access to real-time dynamics under a microscope. Utilizing micro-technologies for the discovery, isolation and cultivation of “unculturable” species will propel many fields forward; drug discovery, point-of-care diagnostics, and fundamental studies in microbial community behaviors rely on the exploration of novel metabolic pathways. However, micro-technologies are still largely proof-of-concept, and scalability and commercialization of micro-technologies will require increased accessibility to expensive equipment and resources, as well as simpler designs for usability. Here, we discuss three different miniaturized culturing practices; including microarrays, micromachined devices, and microfluidics; advancements to the field, and perceived challenges.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    