Title: ProteinWeaver: A Webtool to Visualize Ontology-Annotated Protein Networks
Molecular interaction networks are a vital tool for studying biological systems. While many tools exist that visualize a protein or a pathway within a network, no tool provides the ability for a researcher to consider a protein's position in a network in the context of a specific biological process or pathway. We developed ProteinWeaver, a web-based tool designed to visualize and analyze non-human protein interaction networks by integrating known biological functions. ProteinWeaver provides users with an intuitive interface to situate a user-specified protein in a user-provided biological context (as a Gene Ontology term) in five model organisms. ProteinWeaver also reports the presence of physical and regulatory network motifs within the queried subnetwork and statistics about the protein's distance to the biological process or pathway within the network. These insights can help researchers generate testable hypotheses about the protein's potential role in the process or pathway under study. Two cell biology case studies demonstrate ProteinWeaver's potential to generate hypotheses from the queried subnetworks. ProteinWeaver is available at https://proteinweaver.reedcompbio.org/.
Award ID(s):
1750981
PAR ID:
10621902
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like this
  1. Shao, Mingfu (Ed.)
    Graphs are powerful tools for modeling and analyzing molecular interaction networks. Graphs typically represent either undirected physical interactions or directed regulatory relationships, which can obscure a particular protein’s functional context. Graphlets can describe local topologies and patterns within graphs, and combining physical and regulatory interactions offers new graphlet configurations that can provide biological insights. We present GRPhIN, a tool for characterizing graphlets and protein roles within graphlets in mixed physical and regulatory interaction networks. We describe the graphlets of mixed networks in B. subtilis, C. elegans, D. melanogaster, D. rerio, and S. cerevisiae and examine local topologies of proteins and subnetworks related to the oxidative stress response pathway. We found a number of graphlets that were abundant in all species, specific node positions (orbits) within graphlets that were over-represented in stress-associated proteins, and rarely-occurring graphlets that were over-represented in oxidative stress subnetworks. These results showcase the potential for using graphlets in mixed physical and regulatory interaction networks to identify new patterns beyond a single interaction type.
  2. The Soybean Gene Atlas project provides a comprehensive map for understanding gene expression patterns in major soybean tissues from flower, root, leaf, nodule, seed, and shoot and stem. The RNA‐Seq data generated in the project serve as a valuable resource for discovering tissue‐specific transcriptome behavior of soybean genes in different tissues. We developed a computational pipeline for Soybean context‐specific network (SoyCSN) inference with a suite of prediction tools to analyze, annotate, retrieve, and visualize soybean context‐specific networks at both transcriptome and interactome levels. BicMix and Cross‐Conditions Cluster Detection algorithms were applied to detect modules based on co‐expression relationships across all the tissues. Soybean context‐specific interactomes were predicted by combining soybean tissue gene expression and protein–protein interaction data. Functional analyses of these predicted networks provide insights into soybean tissue specificities. For example, under symbiotic, nitrogen‐fixing conditions, the constructed soybean leaf network highlights the connection between the photosynthesis function and rhizobium–legume symbiosis. SoyCSN data and all its results are publicly available via an interactive web service within the Soybean Knowledge Base (SoyKB) at http://soykb.org/SoyCSN. SoyCSN provides a useful web‐based access for exploring context specificities systematically in gene regulatory mechanisms and gene relationships for soybean researchers and molecular breeders.
  3. The growing complexity of biological data has spurred the development of innovative computational techniques to extract meaningful information and uncover hidden patterns within vast datasets. Biological networks, such as gene regulatory networks and protein-protein interaction networks, hold critical insights into biological features’ connections and functions. Integrating and analyzing high-dimensional data, particularly in gene expression studies, stands prominent among the challenges in deciphering these networks. Clustering methods play a crucial role in addressing these challenges, with spectral clustering emerging as a potent unsupervised technique considering intrinsic geometric structures. However, spectral clustering’s user-defined cluster number can lead to inconsistent and sometimes orthogonal clustering regimes. We propose the Multi-layer Bundling (MLB) method to address this limitation, combining multiple prominent clustering regimes to offer a comprehensive data view. We call the outcome clusters “bundles”. This approach refines clustering outcomes, unravels hierarchical organization, and identifies bridge elements mediating communication between network components. By layering clustering results, MLB provides a global-to-local view of biological feature clusters enabling insights into intricate biological systems. Furthermore, the method enhances bundle network predictions by integrating the bundle co-cluster matrix with the affinity matrix. The versatility of MLB extends beyond biological networks, making it applicable to various domains where understanding complex relationships and patterns is needed.
  4. Clifford Whitcomb (Ed.)
    Analyzing interactions between actors from a systems perspective yields valuable information about the overall system's form and function. When this is coupled with ecological modeling and analysis techniques, biological inspiration can also be applied to these systems. The diagnostic value of three metrics frequently used to study mutualistic biological ecosystems (nestedness, modularity, and connectance) is shown here using academic engineering makerspaces. Engineering students get hands‐on usage experience with tools for personal, class, and competition‐based projects in these spaces. COVID‐19 provides a unique study of university makerspaces, enabling the analysis of makerspace health through the known disturbance and resultant regulatory changes (implementation and return to normal operations). Nestedness, modularity, and connectance are shown to provide information on space functioning in a way that enables them to serve as heuristic diagnostic tools for system conditions. The makerspaces at two large R1 universities are analyzed across multiple semesters by modeling them as bipartite student‐tool interaction networks. The results visualize the predictive ability of these metrics, finding that the makerspaces tended to be structurally nested in any one semester; however, when compared to a “normal” semester, the restrictions are reflected via a higher modularity. The makerspace network case studies provide insight into the use and value of quantitative ecosystem structure and function indicators for monitoring similar human‐engineered interaction networks that are normally only tracked qualitatively.
  5. Mura, Cameron (Ed.)
    Machine learning (ML) is increasingly being used to guide biological discovery in biomedicine such as prioritizing promising small molecules in drug discovery. In those applications, ML models are used to predict the properties of biological systems, and researchers use these predictions to prioritize candidates as new biological hypotheses for downstream experimental validations. However, when applied to unseen situations, these models can be overconfident and produce a large number of false positives. One solution to address this issue is to quantify the model’s prediction uncertainty and provide a set of hypotheses with a controlled false discovery rate (FDR) pre-specified by researchers. We propose CPEC, an ML framework for FDR-controlled biological discovery. We demonstrate its effectiveness using enzyme function annotation as a case study, simulating the discovery process of identifying the functions of less-characterized enzymes. CPEC integrates a deep learning model with a statistical tool known as conformal prediction, providing accurate and FDR-controlled function predictions for a given protein enzyme. Conformal prediction provides rigorous statistical guarantees to the predictive model and ensures that the expected FDR will not exceed a user-specified level with high probability. Evaluation experiments show that CPEC achieves reliable FDR control, better or comparable prediction performance at a lower FDR than existing methods, and accurate predictions for enzymes under-represented in the training data. We expect CPEC to be a useful tool for biological discovery applications where a high yield rate in validation experiments is desired but the experimental budget is limited. 