NSF PAR Search | NSF Public Access Repository

Interpretable network propagation with application to expanding the repertoire of human proteins that interact with SARS-CoV-2

https://doi.org/10.1093/gigascience/giab082

Law, Jeffrey N; Akers, Kyle; Tasnina, Nure; Santina, Catherine M Della; Deutsch, Shay; Kshirsagar, Meghana; Klein-Seetharaman, Judith; Crovella, Mark; Rajagopalan, Padmavathy; Kasif, Simon; et al (December 2021, GigaScience)

Abstract Background: Network propagation has been widely used for nearly 20 years to predict gene functions and phenotypes. Despite the popularity of this approach, little attention has been paid to the question of provenance tracing in this context, e.g., determining how much any experimental observation in the input contributes to the score of every prediction. Results: We design a network propagation framework with 2 novel components and apply it to predict human proteins that directly or indirectly interact with SARS-CoV-2 proteins. First, we trace the provenance of each prediction to its experimentally validated sources, which in our case are human proteins experimentally determined to interact with viral proteins. Second, we design a technique that helps to reduce the manual adjustment of parameters by users. We find that for every top-ranking prediction, the highest contribution to its score arises from a direct neighbor in a human protein-protein interaction network. We further analyze these results to develop functional insights on SARS-CoV-2 that expand on known biology such as the connection between endoplasmic reticulum stress, HSPA5, and anti-clotting agents. Conclusions: We examine how our provenance-tracing method can be generalized to a broad class of network-based algorithms. We provide a useful resource for the SARS-CoV-2 community that implicates many previously undocumented proteins with putative functional relationships to viral infection. This resource includes potential drugs that can be opportunistically repositioned to target these proteins. We also discuss how our overall framework can be extended to other, newly emerging viruses.
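The idea of tracing a prediction's score back to its experimentally validated sources can be illustrated with a small sketch. It assumes random walk with restart as the propagation method, which this abstract does not specify; the function name `rwr_contributions` and the `restart` parameter are hypothetical, and this is not the authors' implementation. Because the propagated score is linear in the source vector, each node's score splits exactly into one term per source.

```python
import numpy as np

def rwr_contributions(adj, sources, restart=0.5):
    """Split each node's random-walk-with-restart score by source node.

    Assumed setup (not taken from the paper): `adj` is a symmetric,
    non-negative adjacency matrix and `sources` lists the indices of the
    experimentally observed nodes, each given equal starting weight.
    Returns `contrib`, where contrib[v, u] is the part of node v's score
    that originates at source u.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid dividing by zero for isolated nodes
    W = adj / col_sums                     # column-stochastic transition matrix
    # Closed form of RWR: scores = M @ p with M = r * (I - (1 - r) W)^-1
    M = restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * W)
    p = np.zeros(n)
    p[list(sources)] = 1.0 / len(sources)
    return M * p                           # scales column u of M by p[u]

# Toy path network 0 - 1 - 2 - 3 with sources at both ends.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
contrib = rwr_contributions(adj, sources=[0, 3])
scores = contrib.sum(axis=1)               # the usual propagation scores
top_source = contrib.argmax(axis=1)        # largest contributor for each node
```

Summing a row of `contrib` recovers the ordinary propagation score, so the decomposition adds interpretability without changing the ranking.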
Reconstructing signaling pathways using regular language constrained paths

https://doi.org/10.1093/bioinformatics/btz360

Wagner, Mitchell J.; Pratapa, Aditya; Murali, T. M. (July 2019, Bioinformatics)

Abstract Motivation: High-quality curation of the proteins and interactions in signaling pathways is slow and painstaking. As a result, many experimentally detected interactions are not annotated to any pathways. A natural question that arises is whether or not it is possible to automatically leverage existing pathway annotations to identify new interactions for inclusion in a given pathway. Results: We present RegLinker, an algorithm that achieves this purpose by computing multiple short paths from pathway receptors to transcription factors within a background interaction network. The key idea underlying RegLinker is the use of regular language constraints to control the number of non-pathway interactions that are present in the computed paths. We systematically evaluate RegLinker and five alternative approaches against a comprehensive set of 15 signaling pathways and demonstrate that RegLinker recovers withheld pathway proteins and interactions with the best precision and recall. We used RegLinker to propose new extensions to the pathways. We discuss the literature that supports the inclusion of these proteins in the pathways. These results show the broad potential of automated analysis to attenuate difficulties of traditional manual inquiry. Availability and implementation: https://github.com/Murali-group/RegLinker. Supplementary information: Supplementary data are available at Bioinformatics online.
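A regular-language constraint on paths can be enforced by searching a product of the network and a small automaton. The sketch below is a simplified illustration, not RegLinker itself: it finds a single shortest path rather than multiple short paths, and it assumes the language "at most `max_new` non-pathway edges", with edge labels `"p"` (pathway) and `"n"` (non-pathway). The function name and the label scheme are hypothetical.

```python
import heapq

def constrained_shortest_path(edges, source, target, max_new):
    """Shortest path using at most `max_new` edges labeled "n".

    `edges` is a list of directed (u, v, weight, label) tuples with
    non-negative weights. The search state is (node, number of "n" edges
    used so far), i.e. a node of the product of the network with a counting
    automaton. Returns (cost, path) or None if no path satisfies the constraint.
    """
    graph = {}
    for u, v, w, label in edges:
        graph.setdefault(u, []).append((v, w, label))

    dist = {(source, 0): 0.0}
    prev = {}
    heap = [(0.0, source, 0)]
    while heap:
        d, u, used = heapq.heappop(heap)
        if d > dist.get((u, used), float("inf")):
            continue                        # stale heap entry
        if u == target:
            path, state = [u], (u, used)
            while state in prev:            # walk back through product states
                state = prev[state]
                path.append(state[0])
            return d, path[::-1]
        for v, w, label in graph.get(u, []):
            nxt = used + (1 if label == "n" else 0)
            if nxt > max_new:
                continue                    # the automaton rejects this extension
            nd = d + w
            if nd < dist.get((v, nxt), float("inf")):
                dist[(v, nxt)] = nd
                prev[(v, nxt)] = (u, used)
                heapq.heappush(heap, (nd, v, nxt))
    return None

# Toy network: receptor R, intermediate A, transcription factor TF.
edges = [("R", "A", 1, "p"), ("A", "TF", 1, "p"), ("R", "TF", 1, "n")]
strict = constrained_shortest_path(edges, "R", "TF", max_new=0)
relaxed = constrained_shortest_path(edges, "R", "TF", max_new=1)
```

With `max_new=0` the search is confined to annotated pathway edges; raising the bound lets it take the direct non-pathway shortcut, which is how a constraint of this kind controls how much new material enters a path.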
Provenance Tracing in Network Diffusion Algorithms

Tasnina, Nure; Crovella, Mark; Kasif, Simon; Murali, T M (January 2026, Proceedings of the Pacific Symposium on Biocomputing)

We propose a novel strategy for provenance tracing in random walk-based network diffusion algorithms, a problem that has been surprisingly overlooked in spite of the widespread use of diffusion algorithms in biological applications. Our path-based approach enables ranking paths by the magnitude of their contribution to each node’s score, offering insight into how information propagates through a network. Building on this capability, we introduce two quantitative measures: (i) path-based effective diffusion, which evaluates how well a diffusion algorithm leverages the full topology of a network, and (ii) diffusion betweenness, which quantifies a node’s importance in propagating scores. We applied our framework to SARS-CoV-2 protein interactors and human PPI networks. Provenance tracing of the Regularized Laplacian and Random Walk with Restart algorithms revealed that a substantial amount of a node’s score is contributed via multi-edge paths, demonstrating that diffusion algorithms exploit the non-local structure of the network. Analysis of diffusion betweenness identified proteins playing a critical role in score propagation; proteins with high diffusion betweenness are enriched with essential human genes and interactors of other viruses, supporting the biological interpretability of the metric. Finally, in a signaling network composed of causal interactions between human proteins, the top contributing paths showed strong overlap with COVID-19-related pathways. These results suggest that our path-based framework offers valuable insight into diffusion algorithms and can serve as a powerful tool for interpreting diffusion scores in a biologically meaningful context, complementing existing module- or node-centric approaches in systems biology. The code is publicly available at https://github.com/n-tasnina/provenance-tracing.git under the GNU General Public License v3.0.
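One way to see how multi-edge routes contribute to a Random Walk with Restart score is to expand the score as a series over walk lengths. The sketch below is an illustration under stated assumptions rather than the authors' path-ranking method: it groups contributions by walk length (walks may revisit nodes, unlike simple paths), and the function name `score_by_walk_length` and its parameters are hypothetical.

```python
import numpy as np

def score_by_walk_length(adj, source, restart=0.5, max_len=50):
    """Decompose RWR scores from one source by walk length.

    Uses the series  s = sum_k  r (1 - r)^k W^k e_source,  where W is the
    column-normalized adjacency matrix. Returns `terms`, where terms[k, v]
    is the part of node v's score carried by walks of exactly k edges.
    The series is truncated at `max_len`, so the sum is an approximation.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # guard against isolated nodes
    W = adj / col_sums
    vec = np.zeros(adj.shape[0])
    vec[source] = 1.0                      # W^0 e_source
    terms = []
    for k in range(max_len + 1):
        terms.append(restart * (1.0 - restart) ** k * vec)
        vec = W @ vec                      # advance to W^(k+1) e_source
    return np.array(terms)

# Toy path network 0 - 1 - 2 with the source at node 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
terms = score_by_walk_length(adj, source=0)
total = terms.sum(axis=0)
# Share of each node's score that arrives over walks of two or more edges.
multi_edge_share = terms[2:].sum(axis=0) / total
```

In this toy network node 2 is two edges from the source, so all of its score necessarily arrives over multi-edge walks, while node 1 receives a mix of direct and longer-range contributions.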
Free, publicly-accessible full text available January 3, 2027
Computational Construction of Toxicant Signaling Networks

https://doi.org/10.1021/acs.chemrestox.2c00422

Law, Jeffrey N.; Orbach, Sophia M.; Weston, Bronson R.; Steele, Peter A.; Rajagopalan, Padmavathy; Murali, T. M. (July 2023, Chemical Research in Toxicology)
Modeling and analysis of the macronutrient signaling network in budding yeast

https://doi.org/10.1091/mbc.E20-02-0117

Jalihal, Amogh P.; Kraikivski, Pavel; Murali, T. M.; Tyson, John J. (November 2021, Molecular Biology of the Cell)
Edelstein-Keshet, Leah (Ed.)
Adaptive modulation of the global cellular growth state of unicellular organisms is crucial for their survival in fluctuating nutrient environments. Because these organisms must be able to respond reliably to ever varying and unpredictable nutritional conditions, their nutrient signaling networks must have a certain inbuilt robustness. In eukaryotes, such as the budding yeast Saccharomyces cerevisiae, distinct nutrient signals are relayed by specific plasma membrane receptors to signal transduction pathways that are interconnected in complex information-processing networks, which have been well characterized. However, the complexity of the signaling network confounds the interpretation of the overall regulatory “logic” of the control system. Here, we propose a literature-curated molecular mechanism of the integrated nutrient signaling network in budding yeast, focusing on early temporal responses to carbon and nitrogen signaling. We build a computational model of this network to reconcile literature-curated quantitative experimental data with our proposed molecular mechanism. We evaluate the robustness of our estimates of the model’s kinetic parameter values. We test the model by comparing predictions made in mutant strains with qualitative experimental observations made in the same strains. Finally, we use the model to predict nutrient-responsive transcription factor activities in a number of mutant strains undergoing complex nutrient shifts.
Full Text Available
Gene regulatory network inference in single-cell biology

https://doi.org/10.1016/j.coisb.2021.04.007

Akers, Kyle; Murali, T.M. (June 2021, Current Opinion in Systems Biology)
Full Text Available
Genetic interactions derived from high-throughput phenotyping of 6589 yeast cell cycle mutants

https://doi.org/10.1038/s41540-020-0134-z

Gallegos, Jenna E.; Adames, Neil R.; Rogers, Mark F.; Kraikivski, Pavel; Ibele, Aubrey; Nurzynski-Loth, Kevin; Kudlow, Eric; Murali, T. M.; Tyson, John J.; Peccoud, Jean (December 2020, npj Systems Biology and Applications)

Full Text Available
Protein sequence models for prediction and comparative analysis of the SARS-CoV-2–human interactome

https://doi.org/10.1142/9789811232701_0015

Kshirsagar, Meghana; Tasnina, Nure; Ward, Michael D.; Law, Jeffrey N.; Murali, T. M.; Lavista Ferres, Juan M.; Bowman, Gregory R.; Klein-Seetharaman, Judith (November 2020, Pacific Symposium on Biocomputing)
Viruses such as the novel coronavirus SARS-CoV-2, which is wreaking havoc on the world, depend on interactions of their own proteins with those of the human host cells. Relatively small changes in sequence, such as between SARS-CoV and SARS-CoV-2, can dramatically change clinical phenotypes of the virus, including transmission rates and severity of the disease. On the other hand, highly dissimilar virus families such as Coronaviridae, Ebola, and HIV have overlap in functions. In this work we aim to analyze the role of protein sequence in the binding of SARS-CoV-2 virus proteins to human proteins and compare it to that of the other viruses above. We build supervised machine learning models, using Generalized Additive Models, to predict interactions based on sequence features and find that our models perform well, with an AUC-PR of 0.65 at a class skew of 1:10. Analysis of the novel predictions using an independent dataset showed statistically significant enrichment. We further map the importance of specific amino-acid sequence features in predicting binding and summarize which combinations of sequences from the virus and the host are correlated with an interaction. By analyzing the sequence-based embeddings of the interactomes from different viruses and clustering them together, we find some functionally similar proteins from different viruses. For example, vif protein from HIV-1, vp24 from Ebola, and orf3b from SARS-CoV all function as interferon antagonists. Furthermore, we can differentiate the functions of similar viruses; for example, orf3a’s interactions are more diverged than orf7b’s interactions when comparing SARS-CoV and SARS-CoV-2.
Full Text Available
Accurate and efficient gene function prediction using a multi-bacterial network

https://doi.org/10.1093/bioinformatics/btaa885

Law, Jeffrey N; Kale, Shiv D; Murali, T M (October 2020, Bioinformatics)
Cowen, Lenore (Ed.)
Abstract Motivation: Nearly 40% of the genes in sequenced genomes have no experimentally or computationally derived functional annotations. To fill this gap, we seek to develop methods for network-based gene function prediction that can integrate heterogeneous data for multiple species with experimentally based functional annotations and systematically transfer them to newly sequenced organisms on a genome-wide scale. However, the large sizes of such networks pose a challenge for the scalability of current methods. Results: We develop a label propagation algorithm called FastSinkSource. By formally bounding its rate of progress, we decrease the running time by a factor of 100 without sacrificing accuracy. We systematically evaluate many approaches to construct multi-species bacterial networks and apply FastSinkSource and other state-of-the-art methods to these networks. We find that the most accurate and efficient approach is to pre-compute annotation scores for species with experimental annotations, and then to transfer them to other organisms. In this manner, FastSinkSource runs in under 3 min for 200 bacterial species. Availability and implementation: An implementation of our framework and all data used in this research are available at https://github.com/Murali-group/multi-species-GOA-prediction. Supplementary information: Supplementary data are available at Bioinformatics online.
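The general shape of label propagation with fixed positive and negative examples can be sketched as follows. This is a generic illustration and not the FastSinkSource implementation: the damping factor `alpha`, the fixed iteration count, and the function name `propagate_labels` are assumptions made for the example, and the convergence bound described in the abstract is not reproduced here.

```python
import numpy as np

def propagate_labels(adj, positives, negatives, alpha=0.9, iters=100):
    """Iterative label propagation with clamped labeled nodes.

    Positive examples are held at score 1 and negative examples at score 0.
    In each round, every unlabeled node takes `alpha` times the weighted
    average of its neighbors' current scores. `adj` is a symmetric,
    non-negative adjacency matrix.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0                    # guard against isolated nodes
    P = adj / deg[:, None]                 # row-normalized weights

    fixed = np.zeros(n, dtype=bool)
    fixed[list(positives)] = True
    fixed[list(negatives)] = True
    scores = np.zeros(n)
    scores[list(positives)] = 1.0

    for _ in range(iters):
        updated = alpha * (P @ scores)
        updated[fixed] = scores[fixed]     # keep labeled nodes clamped
        scores = updated
    return scores

# Toy path network 0 - 1 - 2 - 3 - 4: node 0 is annotated, node 4 is not.
adj = [[0, 1, 0, 0, 0],
       [1, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
scores = propagate_labels(adj, positives=[0], negatives=[4])
```

Scores of unlabeled nodes decay with distance from the positive example, which is the behavior that lets annotations be ranked for transfer to unannotated genes.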
Full Text Available
Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data

https://doi.org/10.1038/s41592-019-0690-6

Pratapa, Aditya; Jalihal, Amogh P.; Law, Jeffrey N.; Bharadwaj, Aditya; Murali, T. M. (February 2020, Nature Methods)

Full Text Available