Protein methylation is a vital regulator of many biological processes at the post-translational level, and accurate prediction of protein methylation sites is essential for research and drug discovery. In this paper, we present a new method, namely RMSxAI, to predict the arginine methylation sites from primary sequences using machine learning algorithms and describe the predictions using explainable artificial intelligence (XAI) techniques. Leveraging experimentally validated methylated and unmethylated protein sequences from diverse organisms, we deduced several sequence features, encompassing physicochemical properties, amino acid composition, and evolutionary insights. Our results show that the proposed RMSxAI can predict protein methylation sites with high accuracy, bringing the F1 score up to 0.88 and overall accuracy up to 88.4%. We use various XAI methods to explain the output results. These include key features, partial occupancy maps, and local variation models that provide insight into key features and interactions that lead to predictions. Overall, our approach is relevant to research and drug discovery, and our results demonstrate the potential of machine learning algorithms and XAI methods to provide accurate and meaningful prediction of arginine methylation sites.
-
Abstract Technology offers considerable potential for improving the integrity and efficiency of infrastructure. Cracking is one of the major concerns that can affect the integrity or usability of any structure. Manual inspection methods often lead to delays that can worsen the situation, so automated crack detection has become necessary for efficient management and inspection of critical infrastructure. Previous research in crack detection employed classification and localization-based models using Deep Convolutional Neural Networks (DCNNs). This study proposes and compares the effectiveness of transfer-learned DCNNs for crack detection, both as classification models and as feature extractors. The main objective of this paper is to present various methods of crack detection on surfaces and compare their performance over three different datasets. The experiments conducted in this work are threefold. First, the effectiveness of 12 transfer-learned DCNN models for crack detection is analyzed on three publicly available datasets: SDNET, CCIC and BSD. With an accuracy of 53.40%, ResNet101 outperformed other models on the SDNET dataset. EfficientNetB0 was the most accurate model (98.8%) on the BSD dataset, and ResNet50 performed best on the CCIC dataset with an accuracy of 99.8%. Second, two image enhancement methods are applied to the images, and the 12 DCNNs are transfer learned on the enhanced images to improve performance on the SDNET dataset. The results show that the enhanced images significantly improved the accuracy of the transfer-learned crack detection models. Third, deep features extracted from the last fully connected layer of the DCNNs are used to train a Support Vector Machine (SVM).
The integration of deep features with the SVM enhanced the detection accuracy across all DCNN-dataset combinations, according to analysis in terms of accuracy, precision, recall, and F1-score.
-
Abstract In this paper, NeuralProphet (NP), an explainable hybrid modular framework, improves pandemic forecasting performance by adding two neural network modules: an auto-regressor (AR) and a lagged-regressor (LR). An advanced deep auto-regressor neural network (Deep-AR-Net) model is employed to implement these two modules. The enhanced NP is optimized with AdamW and the Huber loss function to perform multivariate multi-step forecasting, in contrast to Prophet. The models are validated with COVID-19 time-series datasets. NP's efficiency is studied component-wise for a long-term forecast for India, showing an overall reduction in MASE of 60.36% compared to Prophet, with individual reductions of 34.7% from the AR module and 53.4% from the LR module. The Deep-AR-Net model reduces the forecasting error of NP for all five countries, on average, by 49.21% and 46.07% for short-term and long-term forecasts, respectively. The visualizations confirm that NP's forecasting curves are closer to the actual cases and differ significantly from Prophet's. Hence, NP can be used to develop a real-time decision-making system for highly infectious diseases.
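For readers unfamiliar with the error measure quoted above, the following is a minimal sketch of the mean absolute scaled error (MASE) in its usual form: the forecast's mean absolute error divided by the in-sample mean absolute error of a naive (seasonal) forecast. The abstract does not say how the authors computed MASE, so the function names, the `m` seasonality parameter, and the helper `percent_reduction` are illustrative assumptions rather than the paper's implementation.

```python
from typing import Sequence


def mase(actual: Sequence[float], forecast: Sequence[float],
         train: Sequence[float], m: int = 1) -> float:
    """Mean absolute scaled error (standard definition, assumed here).

    actual/forecast: values over the forecast horizon.
    train: in-sample history used to scale the error.
    m: seasonal period of the naive benchmark (1 = plain naive forecast).
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal length")
    if len(train) <= m:
        raise ValueError("train must contain more than m observations")
    # Mean absolute error of the forecast being evaluated.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    # In-sample mean absolute error of the naive forecast y[t] = y[t - m].
    scale = sum(abs(train[i] - train[i - m])
                for i in range(m, len(train))) / (len(train) - m)
    if scale == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    return mae / scale


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of an error metric, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    history = [10, 12, 14, 16]          # toy in-sample series
    score = mase([18, 20], [17, 22], history)
    print(score, percent_reduction(2.0, 1.0))
```

A value below 1 means the forecast beats the naive benchmark on average; a relative reduction such as the percentages quoted above compares two such scores.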
-
The ongoing COVID-19 pandemic continues to infect people worldwide, and the virus continues to evolve in significant ways that can challenge the effectiveness of available vaccines and therapeutic drugs and cause future pandemics. Therefore, it is important to investigate the binding and interaction of ACE2 with different RBD variants. A comparative study using all-atom MD simulations was conducted on ACE2 binding with 8 different RBD variants, including N501Y, E484K, P479S, T478I, S477N, N439K, K417N and N501Y-E484K-K417N on RBD. Based on the RMSD, RMSF, and DSSP results, the binding of RBD variants with ACE2 is stable overall, and the secondary structures of RBD and ACE2 are consistent after the point mutations. In addition, a similar buried surface area, a consistent binding interface, and a similar number of hydrogen bonds were found between RBD and ACE2, although the exact residue pairs on the binding interface were modified. The change in binding free energy due to point mutation was predicted using the free energy perturbation (FEP) method. N501Y, N439K, and K417N were found to strengthen the binding of RBD with ACE2, while E484K and P479S weaken the binding, and S477N and T478I have a negligible effect on the binding. Point mutations modified the dynamic correlation of residues in RBD, based on the dihedral angle covariance matrix calculation. Using dynamic network analysis, a common intrinsic network community extending from the tail of RBD to the center and then to the binding interface region was found, which could communicate the dynamics in the binding interface region to the tail and thus to the other sections of the S protein. The results provide a methodology and molecular insight for studying the molecular structure and dynamics relevant to possible future pandemics and for designing novel drugs.
Free, publicly-accessible full text available October 5, 2024
-
The global rise in heart disease necessitates precise prediction tools to assess individual risk levels. This paper introduces a novel Multi-Objective Artificial Bee Colony Optimized Hybrid Deep Belief Network and XGBoost (HDBN-XG) algorithm, enhancing coronary heart disease prediction accuracy. Key physiological data, including Electrocardiogram (ECG) readings and blood volume measurements, are analyzed. The HDBN-XG algorithm assesses data quality, normalizes using z-score values, extracts features via the Computational Rough Set method, and constructs feature subsets using the Multi-Objective Artificial Bee Colony approach. Our findings indicate that the HDBN-XG algorithm achieves an accuracy of 99%, precision of 95%, specificity of 98%, sensitivity of 97%, and F1-measure of 96%, outperforming existing classifiers. This paper contributes to predictive analytics by offering a data-driven approach to healthcare, providing insights to mitigate the global impact of coronary heart disease.
Free, publicly-accessible full text available November 16, 2024
-
Abstract Breast cancer has emerged as the most life-threatening disease among women around the world. Early detection and treatment of breast cancer are thought to reduce the need for surgery and boost the survival rate. Magnetic Resonance Imaging (MRI) segmentation techniques for breast cancer diagnosis are investigated in this article. Kapur's entropy-based multilevel thresholding is used in this study to determine optimal threshold values for breast DCE-MRI lesion segmentation using Gorilla Troops Optimization (GTO). An improved GTO, called GTORBL, is developed by incorporating rotational opposition-based learning (RBL) into GTO and is applied to the same problem. The proposed approaches are tested on 100 T2 Weighted Sagittal (T2 WS) DCE-MRI slices from 20 patients. The proposed approaches are compared with Tunicate Swarm Algorithm (TSA), Particle Swarm Optimization (PSO), Arithmetic Optimization Algorithm (AOA), Slime Mould Algorithm (SMA), Multi-verse Optimization (MVO), Hidden Markov Random Field (HMRF), Improved Markov Random Field (IMRF), and Conventional Markov Random Field (CMRF). The proposed GTO-based approach achieves a Dice Similarity Coefficient (DSC), sensitivity, and accuracy of 87.04%, 90.96%, and 98.13%, respectively. The proposed GTORBL-based segmentation method achieves an accuracy of 99.31%, a sensitivity of 95.45%, and a DSC of 91.54%. The one-way ANOVA test followed by Tukey HSD and the Wilcoxon Signed Rank Test are used to examine the results. Furthermore, Multi-Criteria Decision Making is used to evaluate overall performance in terms of sensitivity, accuracy, false-positive rate, precision, specificity, F1-score, Geometric-Mean, and DSC. According to both quantitative and qualitative findings, the proposed strategies outperform the compared methodologies.
-
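The objective behind Kapur's entropy-based multilevel thresholding can be sketched briefly: a set of thresholds splits the gray-level histogram into classes, and the thresholds are chosen to maximize the sum of the entropies of the class-conditional distributions. The sketch below uses an exhaustive search over a small histogram purely for illustration; it is a stand-in for, not a reproduction of, the GTO and GTORBL optimizers named in the abstract, and all function names are assumptions.

```python
import math
from itertools import combinations
from typing import Sequence, Tuple


def kapur_entropy(hist: Sequence[float], thresholds: Sequence[int]) -> float:
    """Sum of class entropies for a histogram split at the given thresholds.

    A threshold t places gray levels < t in the lower class and levels >= t
    in the upper class, so k thresholds yield k + 1 classes.
    """
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    probs = [h / total for h in hist]
    bounds = [0] + sorted(thresholds) + [len(hist)]
    score = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        weight = sum(probs[lo:hi])          # probability mass of this class
        if weight <= 0:
            continue                        # empty class contributes nothing
        # Entropy of the class-conditional distribution p_i / weight.
        score -= sum((p / weight) * math.log(p / weight)
                     for p in probs[lo:hi] if p > 0)
    return score


def best_thresholds(hist: Sequence[float], k: int) -> Tuple[int, ...]:
    """Brute-force maximizer of Kapur's entropy (illustrative only).

    The search space grows combinatorially with k and the number of gray
    levels, which is why metaheuristic optimizers are used in practice.
    """
    candidates = combinations(range(1, len(hist)), k)
    return max(candidates, key=lambda t: kapur_entropy(hist, t))


if __name__ == "__main__":
    toy_hist = [1, 1, 1, 1]                 # four equally likely gray levels
    print(best_thresholds(toy_hist, 1), kapur_entropy(toy_hist, [2]))
```

For a 256-level image and several thresholds the exhaustive search becomes impractical, which motivates swapping in a population-based optimizer while keeping the same objective function.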
Abstract Road network design, as an important part of landscape modeling, is of great significance in automatic driving, video game development, and disaster simulation. To date, this task remains labor‐intensive, tedious and time‐consuming. Many improved techniques have been proposed during the last two decades. Nevertheless, most of the state‐of‐the‐art methods still encounter problems of intuitiveness, usefulness and/or interactivity. Departing from conventional road design, this paper advocates an improved road modeling framework for automatic and interactive road production driven by geographical maps (including elevation, water, and vegetation maps). Our method integrates the capability of flexible image generation models with a powerful transformer architecture to produce a vectorized road network. First, we construct a dataset that includes road graphs, density maps, and their corresponding geographical maps. Second, we develop a density map generation network based on an image translation model with an attention mechanism to predict a road density map. The use of a density map facilitates faster convergence and better performance, and the map also serves as the input for road graph generation. Third, we employ the transformer architecture to evolve density maps into road graphs. Our comprehensive experimental results have verified the efficiency, robustness and applicability of our newly‐proposed framework for road design.
-
Introduction: Essential genes are indispensable for the survival of various species. These genes are a family linked to critical cellular activities for species survival. They code for proteins that regulate central metabolism, gene translation, deoxyribonucleic acid replication, and fundamental cellular structure, and that facilitate intracellular and extracellular transport. Essential genes preserve crucial genomic information that may hold the key to a detailed knowledge of life and evolution. Essential gene studies have long been regarded as a vital topic in computational biology due to their relevance. An essential gene is composed of adenine, guanine, cytosine, and thymine and their various combinations. Methods: This paper presents a novel method of extracting information on the stationary patterns of nucleotides such as adenine, guanine, cytosine, and thymine in each gene. For this purpose, several co-occurrence matrices are derived that provide the statistical distribution of stationary patterns of nucleotides in the genes, which is helpful in establishing the relationship between the nucleotides. To extract discriminant features, energy, entropy, homogeneity, contrast, and dissimilarity are computed from each co-occurrence matrix and then concatenated across all matrices to form a feature vector representing each essential gene. Finally, supervised machine learning algorithms are applied for essential gene classification based on the extracted fixed-dimensional feature vectors. Results: For comparison, some existing state-of-the-art feature representation techniques such as Shannon entropy (SE), Hurst exponent (HE), fractal dimension (FD), and their combinations have been utilized. Discussion: An extensive experiment has been performed for classifying the essential genes of five species, which shows the robustness and effectiveness of the proposed methodology.
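The feature-extraction idea in the Methods can be illustrated with a small sketch: count how often each ordered nucleotide pair occurs at a fixed offset, normalize the counts into a 4 x 4 co-occurrence matrix, and compute the five named descriptors from it. The abstract does not give the exact matrix definition, offsets, or formulas, so this sketch assumes the common gray-level co-occurrence (GLCM) formulas, an arbitrary `ACGT` index order (which affects contrast, dissimilarity, and homogeneity), and hypothetical function names.

```python
import math
from typing import Dict, List, Sequence

NUCLEOTIDES = "ACGT"  # assumed index order; not specified in the abstract


def cooccurrence(seq: str, offset: int = 1) -> List[List[float]]:
    """Normalized 4x4 matrix of nucleotide pairs (seq[i], seq[i + offset])."""
    index = {n: i for i, n in enumerate(NUCLEOTIDES)}
    counts = [[0] * 4 for _ in range(4)]
    seq = seq.upper()
    for a, b in zip(seq, seq[offset:]):
        if a in index and b in index:       # skip ambiguous symbols such as N
            counts[index[a]][index[b]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("sequence too short for the requested offset")
    return [[c / total for c in row] for row in counts]


def texture_features(p: Sequence[Sequence[float]]) -> Dict[str, float]:
    """GLCM-style descriptors of a normalized co-occurrence matrix."""
    cells = [(i, j, p[i][j]) for i in range(4) for j in range(4)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
    }


def feature_vector(seq: str, offsets: Sequence[int] = (1, 2, 3)) -> List[float]:
    """Concatenate the five descriptors over several offsets (fixed length)."""
    vector: List[float] = []
    for d in offsets:
        vector.extend(texture_features(cooccurrence(seq, d)).values())
    return vector


if __name__ == "__main__":
    print(feature_vector("ATGCGTACGTTAGC"))
```

Because the vector length depends only on the number of offsets, genes of different lengths map to feature vectors of the same dimension, which is what a standard supervised classifier expects.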
-
Bonomo, Robert A. (Ed.) ABSTRACT Microbial diversity is reduced in the gut microbiota of animals and humans treated with selective serotonin reuptake inhibitors (SSRIs) and tricyclic antidepressants (TCAs). The mechanisms driving the changes in microbial composition, while largely unknown, are critical to understand considering that the gut microbiota plays important roles in drug metabolism and brain function. Using Escherichia coli, we show that the SSRI fluoxetine and the TCA amitriptyline exert strong selection pressure for enhanced efflux activity of the AcrAB-TolC pump, a member of the resistance-nodulation-cell division (RND) superfamily of transporters. Sequencing spontaneous fluoxetine- and amitriptyline-resistant mutants revealed mutations in marR and lon, negative regulators of AcrAB-TolC expression. In line with the broad specificity of AcrAB-TolC pumps, these mutations conferred resistance to several classes of antibiotics. We show that the converse also occurs, as spontaneous chloramphenicol-resistant mutants displayed cross-resistance to SSRIs and TCAs. Chemical-genomic screens identified deletions in marR and lon, confirming the results observed for the spontaneous resistant mutants. In addition, deletions in 35 genes with no known role in drug resistance were identified that conferred cross-resistance to antibiotics, and several displayed enhanced efflux activities. These results indicate that combinations of specific antidepressants and antibiotics may have important effects when both are used simultaneously or successively, as they can impose selection for common mechanisms of resistance. Our work suggests that selection for enhanced efflux activities is an important factor to consider in understanding the microbial diversity changes associated with antidepressant treatments. IMPORTANCE Antidepressants are prescribed broadly for psychiatric conditions to alter neuronal levels of synaptic neurotransmitters such as serotonin and norepinephrine.
Two categories of antidepressants are selective serotonin reuptake inhibitors (SSRIs) and tricyclic antidepressants (TCAs); both are among the most prescribed drugs in the United States. While it is well established that antidepressants inhibit reuptake of neurotransmitters, there is evidence that they also impact microbial diversity in the gastrointestinal tract. However, the mechanisms, and therefore the biological and clinical effects, remain obscure. We demonstrate that antidepressants may influence microbial diversity through strong selection for mutant bacteria with increased activity of AcrAB-TolC, an efflux pump that removes antibiotics from cells. Furthermore, we identify a new group of genes that contribute to cross-resistance between antidepressants and antibiotics, several of which act by regulating efflux activity, underscoring overlapping mechanisms. Overall, this work provides new insights into bacterial responses to antidepressants that are important for understanding antidepressant treatment effects.