Abstract Motivation: High-quality curation of the proteins and interactions in signaling pathways is slow and painstaking. As a result, many experimentally detected interactions are not annotated to any pathways. A natural question that arises is whether or not it is possible to automatically leverage existing pathway annotations to identify new interactions for inclusion in a given pathway. Results: We present RegLinker, an algorithm that achieves this purpose by computing multiple short paths from pathway receptors to transcription factors within a background interaction network. The key idea underlying RegLinker is the use of regular language constraints to control the number of non-pathway interactions that are present in the computed paths. We systematically evaluate RegLinker and five alternative approaches against a comprehensive set of 15 signaling pathways and demonstrate that RegLinker recovers withheld pathway proteins and interactions with the best precision and recall. We used RegLinker to propose new extensions to the pathways. We discuss the literature that supports the inclusion of these proteins in the pathways. These results show the broad potential of automated analysis to attenuate difficulties of traditional manual inquiry. Availability and implementation: https://github.com/Murali-group/RegLinker. Supplementary information: Supplementary data are available at Bioinformatics online.
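The idea of bounding the number of non-pathway interactions on a path can be illustrated with a small sketch. This is not the RegLinker implementation: the function name `constrained_shortest_path`, the edge tuple layout, and the `max_new` parameter are hypothetical, and the constraint used here (at most `max_new` non-pathway edges) is only one simple regular language chosen for illustration. The sketch also returns a single shortest path, whereas the abstract describes computing multiple short paths. It runs Dijkstra's algorithm over pairs of (node, automaton state), which is the standard product-graph approach to regular-language-constrained shortest paths.

```python
import heapq


def constrained_shortest_path(edges, source, target, max_new):
    """Shortest source-target path using at most `max_new` non-pathway edges.

    edges: iterable of directed (u, v, weight, in_pathway) tuples (assumed layout).
    Returns (cost, path), or (inf, []) when no admissible path exists.
    """
    adj = {}
    for u, v, w, in_pathway in edges:
        adj.setdefault(u, []).append((v, w, in_pathway))

    # A search state is (node, k), where k counts non-pathway edges used so far.
    # k plays the role of the automaton state: the automaton accepts any edge-label
    # sequence that contains at most `max_new` non-pathway labels.
    start = (source, 0)
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, state = heapq.heappop(heap)
        if d > dist.get(state, float("inf")):
            continue  # stale heap entry
        node, k = state
        if node == target:
            # Walk predecessor links back to the source to recover the path.
            path = [node]
            while state in prev:
                state = prev[state]
                path.append(state[0])
            return d, path[::-1]
        for nxt, w, in_pathway in adj.get(node, []):
            k2 = k if in_pathway else k + 1
            if k2 > max_new:
                continue  # this edge would leave the accepted language
            nstate = (nxt, k2)
            nd = d + w
            if nd < dist.get(nstate, float("inf")):
                dist[nstate] = nd
                prev[nstate] = state
                heapq.heappush(heap, (nd, nstate))
    return float("inf"), []


# Toy network (made-up nodes and weights): R is a receptor, TF a transcription factor.
toy_edges = [
    ("R", "A", 1.0, True), ("A", "TF", 1.0, True),     # annotated pathway route
    ("R", "B", 0.5, False), ("B", "TF", 0.5, False),   # two non-pathway edges
    ("R", "TF", 1.5, False),                           # one non-pathway edge
]
```

Raising `max_new` on the toy network admits progressively cheaper routes that rely on more non-pathway edges, which is the trade-off the constraint is meant to control.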
                    
                            
                            Growing Directed Acyclic Graphs: Optimization Functions for Pathway Reconstruction Algorithms
                        
                    
    
            A major challenge in molecular systems biology is to understand how proteins work to transmit external signals to changes in gene expression. Computationally reconstructing these signaling pathways from protein interaction networks can help understand what is missing from existing pathway databases. We formulate a new pathway reconstruction problem, one that iteratively grows directed acyclic graphs (DAGs) from a set of starting proteins in a protein interaction network. We present an algorithm that provably returns the optimal DAGs for two different cost functions and evaluate the pathway reconstructions when applied to six diverse signaling pathways from the NetPath database. The optimal DAGs outperform an existing k-shortest paths method for pathway reconstruction, and the new reconstructions are enriched for different biological processes. Growing DAGs is a promising step toward reconstructing pathways that provably optimize a specific cost function. 
        
    
                            - Award ID(s):
- 1750981
- PAR ID:
- 10515531
- Publisher / Repository:
- Mary Ann Liebert, Inc.
- Date Published:
- Journal Name:
- Journal of Computational Biology
- Volume:
- 30
- Issue:
- 7
- ISSN:
- 1557-8666
- Page Range / eLocation ID:
- 814 to 828
- Subject(s) / Keyword(s):
- directed acyclic graphs, graph algorithms, pathway reconstruction, signaling pathways
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            Abstract: Proteins are tiny players involved in the activation and deactivation of multiple signaling cascades through interactions in cells. TNFR1 and MADD interact with each other and mediate downstream protein signaling pathways that cause neuronal cell death and Alzheimer’s disease. In the current study, a molecular docking approach was employed to explore the interactive behavior of TNFR1 and MADD proteins and their role in the activation of downstream signaling pathways. The computational sequential and structural conformational results revealed that Asp400, Arg58, and Arg59 were common residues of TNFR1 and MADD that are involved in the activation of downstream signaling pathways. Aspartic acid is a negatively charged residue involved in protein biosynthesis, whereas arginine is a positively charged residue with the potential to interact with oppositely charged amino acids. Furthermore, our molecular dynamics simulation results also confirmed the stability of the backbone of the TNFR1 and MADD death domains (DDs) in binding interactions. This DD interaction mediates some conformational changes in TNFR1, which lead to the activation of mediator proteins in the cellular signaling pathways. Taken together, a better understanding of TNFR1 and MADD receptors and their activated signaling cascade may help treat Alzheimer’s disease. The death domains of TNFR1 and MADD could be used as a novel pharmacological target for the treatment of Alzheimer’s disease by inhibiting the MAPK pathway.
- 
            Abstract Background: Cell signaling pathways, which are a series of reactions that start at receptors and end at transcription factors, are basic to systems biology. Properly modeling the reactions in such pathways requires directed hypergraphs, where an edge is now directed between two sets of vertices. Inferring a pathway by the most parsimonious series of reactions corresponds to finding a shortest hyperpath in a directed hypergraph, which is NP-complete. The current state-of-the-art for shortest hyperpaths in cell signaling hypergraphs solves a mixed-integer linear program to find an optimal hyperpath that is restricted to be acyclic, and offers no efficiency guarantees. Results: We present, for the first time, a heuristic for general shortest hyperpaths that properly handles cycles, and is guaranteed to be efficient. We show the heuristic finds provably optimal hyperpaths for the class of singleton-tail hypergraphs, and also give a practical algorithm for tractably generating all source-sink hyperpaths. The accuracy of the heuristic is demonstrated through comprehensive experiments on all source-sink instances from the standard NCI-PID and Reactome pathway databases, which show it finds a hyperpath that matches the state-of-the-art mixed-integer linear program on over 99% of all instances that are acyclic. On instances where only cyclic hyperpaths exist, the heuristic surpasses the state-of-the-art, which finds no solution; on every such cyclic instance, enumerating all source-sink hyperpaths shows the solution found by the heuristic was in fact optimal. Conclusions: The new shortest hyperpath heuristic is both fast and accurate. This makes finding source-sink hyperpaths, which in general may contain cycles, now practical for real cell signaling networks.
Availability: Source code for the hyperpath heuristic in a new tool we call (as well as for hyperpath enumeration, and all dataset instances) is available free for non-commercial use at .
- 
            Synopsis: The gill proteome of threespine sticklebacks (Gasterosteus aculeatus) differs greatly in populations that inhabit diverse environments characterized by different temperature, salinity, food availability, parasites, and other parameters. To assess the contribution of a specific environmental parameter to such differences it is necessary to isolate its effects from those of other parameters. In this study the effect of environmental salinity on the gill proteome of G. aculeatus was isolated in controlled mesocosm experiments. Salinity-dependent changes in the gill proteome were analyzed by liquid chromatography/tandem mass spectrometry data-independent acquisition (DIA) and Skyline. Relative abundances of 1691 proteins representing the molecular phenotype of stickleback gills were quantified using previously developed MSMS spectral and assay libraries in combination with DIA quantitative proteomics. Non-directional stress responses were distinguished from osmoregulatory protein abundance changes by their consistent occurrence during both hypo- and hyper-osmotic salinity stress in six separate mesocosm experiments. If the abundance of a protein was consistently regulated in opposite directions by hyper- versus hypo-osmotic salinity stress, then it was considered an osmoregulatory protein. In contrast, if protein abundance was consistently increased irrespective of whether salinity was increased or decreased, then it was considered a non-directional response protein. KEGG pathway analysis revealed that the salivary secretion, inositol phosphate metabolism, valine, leucine, and isoleucine degradation, citrate cycle, oxidative phosphorylation, and corresponding endocrine and extracellular signaling pathways contain most of the osmoregulatory gill proteins whose abundance is directly proportional to environmental salinity.
Most proteins that were inversely correlated with salinity map to KEGG pathways that represent proteostasis, immunity, and related intracellular signaling processes. Non-directional stress response proteins represent fatty and amino acid degradation, purine metabolism, focal adhesion, mRNA surveillance, phagosome, endocytosis, and associated intracellular signaling KEGG pathways. These results demonstrate that G. aculeatus responds to salinity changes by adjusting osmoregulatory mechanisms that are distinct from transient non-directional stress responses to control compatible osmolyte synthesis, transepithelial ion transport, and oxidative energy metabolism. Furthermore, this study establishes salinity as a key factor for causing the regulation of numerous proteins and KEGG pathways with established functions in proteostasis, immunity, and tissue remodeling. We conclude that the corresponding osmoregulatory gill proteins and KEGG pathways represent molecular phenotypes that promote transepithelial ion transport, cellular osmoregulation, and gill epithelial remodeling to adjust gill function to environmental salinity.
- 
            Research on protein-protein interaction (PPI) network alignment plays an important role in uncovering underlying biological knowledge such as functionally homologous proteins and conserved evolutionary pathways across different species. Existing methods of PPI network alignment often try to improve the coverage ratio of the alignment result by aligning all proteins from different species. However, there is a fundamental biological premise that needs to be considered carefully: not every protein in a species can, nor should, find its homologous proteins in other species. In this work, we propose a novel alignment method that maps only those proteins with the most similarity throughout the PPI networks of multiple species. For the similarity features of the proteins in the networks, we integrate topological features with biological characteristics to provide enhanced support for the alignment procedures. For topological features, we apply a representation learning method on the networks that generates, for each protein, a low-dimensional vector embedding that captures its surrounding structural features. The topological similarity of proteins from different PPI networks can thus be expressed as the similarity of their corresponding vector representations, which provides a new way to comprehensively quantify the topological similarities between proteins. We also propose a new measure for the topological evaluation of the alignment results that better uncovers the structural quality of the alignment across multiple networks. Both biological and topological evaluations of the alignment results on real datasets demonstrate that our approach is promising and preferable to previous multiple alignment methods.
 An official website of the United States government