Title: Combining spectral clustering and large cut algorithms to find compensatory functional modules from yeast physical and genetic interaction data with GLASS
Various algorithmic and statistical approaches have been proposed to uncover functionally coherent network motifs consisting of sets of genes that may occur as compensatory pathways (called Between Pathway Modules, or BPMs) in a high-throughput S. cerevisiae genetic interaction network. We extend our previous Local-Cut/Genecentric method to also make use of a spectral clustering of the physical interaction network, and uncover some interesting new fault-tolerant modules.
Award ID(s):
1934553
NSF-PAR ID:
10350061
Author(s) / Creator(s):
;
Date Published:
Journal Name:
BCB '22: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Identifying genes that interact to confer a biological function to an organism is one of the main goals of functional genomics. High‐throughput technologies for assessment and quantification of genome‐wide gene expression patterns have enabled systems‐level analyses to infer pathways or networks of genes involved in different functions under many different conditions. Here, we leveraged the publicly available, information‐rich RNA‐Seq datasets of the model plant Arabidopsis thaliana to construct a gene co‐expression network, which was partitioned into clusters or modules that harbor genes correlated by expression. Gene ontology and pathway enrichment analyses were performed to assess functional terms and pathways that were enriched within the different gene modules. By interrogating the co‐expression network for genes in different modules that associate with a gene of interest, diverse functional roles of the gene can be deciphered. By mapping genes differentially expressed under a certain condition in Arabidopsis onto the co‐expression network, we demonstrate the ability of the network to uncover novel genes that are likely transcriptionally active but prone to be missed by standard statistical approaches because they fall outside of the confidence zone of detection. To our knowledge, this is the first A. thaliana co‐expression network constructed using the entire collection of mRNA‐Seq datasets (>20,000) available at the NCBI SRA database. The developed network can serve as a useful resource for the Arabidopsis research community to interrogate specific genes of interest within the network, retrieve the respective interactomes, decipher gene modules that are transcriptionally altered under a certain condition or stage, and gain understanding of gene functions.

     
  2. The structure of neural circuitry plays a crucial role in brain function. Previous studies of brain organization generally had to trade off between coarse descriptions at a large scale and fine descriptions on a small scale. Researchers have now reconstructed tens to hundreds of thousands of neurons at synaptic resolution, enabling investigations into the interplay between global, modular organization, and cell type-specific wiring. Analyzing data of this scale, however, presents unique challenges. To address this problem, we applied novel community detection methods to analyze the synapse-level reconstruction of an adult female Drosophila melanogaster brain containing >20,000 neurons and 10 million synapses. Using a machine-learning algorithm, we find the most densely connected communities of neurons by maximizing a generalized modularity density measure. We resolve the community structure at a range of scales, from large (on the order of thousands of neurons) to small (on the order of tens of neurons). We find that the network is organized hierarchically, and larger-scale communities are composed of smaller-scale structures. Our methods identify well-known features of the fly brain, including its sensory pathways. Moreover, focusing on specific brain regions, we are able to identify subnetworks with distinct connectivity types. For example, manual efforts have identified layered structures in the fan-shaped body. Our methods not only automatically recover this layered structure, but also resolve finer connectivity patterns to downstream and upstream areas. We also find a novel modular organization of the superior neuropil, with distinct clusters of upstream and downstream brain regions dividing the neuropil into several pathways. These methods show that the fine-scale, local network reconstructions made possible by modern experimental methods are sufficiently detailed to identify the organization of the brain across scales, and enable novel predictions about the structure and function of its parts.

    Significance Statement: The Hemibrain is a partial connectome of an adult female Drosophila melanogaster brain containing >20,000 neurons and 10 million synapses. Analyzing the structure of a network of this size requires novel and efficient computational tools. We applied a new community detection method to automatically uncover the modular structure in the Hemibrain dataset by maximizing a generalized modularity measure. This allowed us to resolve the community structure of the fly hemibrain at a range of spatial scales, revealing a hierarchical organization of the network, where larger-scale modules are composed of smaller-scale structures. The method also allowed us to identify subnetworks with distinct cell and connectivity structures, such as the layered structures in the fan-shaped body, and the modular organization of the superior neuropil. Thus, network analysis methods can be adapted to the connectomes being reconstructed using modern experimental methods to reveal the organization of the brain across scales. This supports the view that such connectomes will allow us to uncover the organizational structure of the brain, which can ultimately lead to a better understanding of its function.

     
  3. Nitrogen (N) and water (W), two resources critical for crop productivity, are becoming increasingly limited in soils globally. To address this issue, we aim to uncover the gene regulatory networks (GRNs) that regulate nitrogen use efficiency (NUE), as a function of water availability, in Oryza sativa, a staple for 3.5 billion people. In this study, we infer and validate GRNs that correlate with rice NUE phenotypes affected by N-by-W availability in the field. We did this by exploiting RNA-seq and crop phenotype data from 19 rice varieties grown in a 2x2 N-by-W matrix in the field. First, to identify gene-to-NUE field phenotypes, we analyzed these datasets using weighted gene co-expression network analysis (WGCNA). This identified two network modules ("skyblue" and "grey60") highly correlated with NUE grain yield (NUEg). Next, we focused on 90 TFs contained in these two NUEg modules and predicted their genome-wide targets from the N-and/or-W response datasets using a random forest network inference approach (GENIE3). Next, to validate the GENIE3 TF→target gene predictions, we performed Precision/Recall Analysis (AUPR) using nine datasets for three TFs validated in planta. This analysis set a precision threshold of 0.31, which was used to "prune" the GENIE3 network to high-confidence TF→target gene edges, comprising 88 TFs and 5,716 N-and/or-W response genes. Next, we ranked these 88 TFs based on their significant influence on NUEg target genes responsive to N and/or W signaling. This resulted in a list of 18 prioritized TFs that regulate 551 NUEg target genes responsive to N and/or W signals. We validated the directly regulated targets of two of these candidate NUEg TFs in a plant cell-based TF assay called TARGET, for which we also had in planta data for comparison. Gene ontology analysis revealed that 6/18 NUEg TFs, namely OsbZIP23 (LOC_Os02g52780), Oshox22 (LOC_Os04g45810), LOB39 (LOC_Os03g41330), Oshox13 (LOC_Os03g08960), LOC_Os11g38870, and LOC_Os06g14670, regulate genes annotated for N and/or W signaling. Our results show that OsbZIP23 and Oshox22, known regulators of drought tolerance, also coordinate W-responses with NUEg. This validated network can aid in developing/breeding rice with improved yield on marginal, low N-input, drought-prone soils.
  4. Abstract

    Motivation

    Higher-order interaction patterns among proteins have the potential to reveal mechanisms behind molecular processes and diseases. While clustering methods are used to identify functional groups within molecular interaction networks, these methods largely focus on edge density and do not explicitly take higher-order interactions into consideration. Disease genes in these networks have been shown to exhibit rich higher-order structure in their vicinity, and considering these higher-order interaction patterns in network clustering has the potential to reveal new disease-associated modules.

    Results

    We propose a higher-order community detection method which identifies community structure in networks with respect to specific higher-order connectivity patterns beyond edges. Higher-order community detection on four different protein–protein interaction networks identifies biologically significant modules and disease modules that conventional edge-based clustering methods fail to discover. Higher-order clusters also identify disease modules from genome-wide association study data, including new modules that were not discovered by top-performing approaches in a Disease Module DREAM Challenge. Our approach provides a more comprehensive view of community structure that enables us to predict new disease–gene associations.

    Availability and implementation

    https://github.com/Reed-CompBio/graphlet-clustering.

     
  5. Abstract

    Ecological interactions range from purely specialized to extremely generalized in nature. Recent research has shown very high levels of specialization in the cyanolichens involving Peltigera (mycobionts) and their Nostoc photosynthetic partners (cyanobionts). Yet, little is known about the mechanisms contributing to the establishment and maintenance of such high specialization levels.

    Here, we characterized interactions between Peltigera and Nostoc partners at a global scale, using more than one thousand thalli. We used tools from network theory, community phylogenetics and biogeographical history reconstruction to evaluate how these symbiotic interactions may have evolved.

    After splitting the interaction matrix into modules of preferentially interacting partners, we evaluated how module membership might have evolved along the mycobionts’ phylogeny. We also teased apart the contributions of geographical overlap vs. phylogeny in driving interaction establishment between Peltigera and Nostoc taxa.

    Module affiliation rarely evolves through the splitting of large ancestral modules. Instead, new modules appear to emerge independently, which is often associated with a fungal speciation event. We also found strong phylogenetic signal in these interactions, which suggests that partner switching is constrained by conserved traits. Therefore, it seems that a high rate of fungal diversification following a switch to a new cyanobiont can lead to the formation of large modules, with cyanobionts associating with multiple closely related Peltigera species.

    Finally, when restricting our analyses to Peltigera sister species, the latter differed more through partner acquisition/loss than replacement (i.e., switching). This pattern vanishes as we look at sister species that diverged longer ago. This suggests that fungal speciation may be accompanied by a stepwise process of (a) novel partner acquisition and (b) loss of the ancestral partner. This could explain the maintenance of high specialization levels in this symbiotic system where the transmission of the cyanobiont to the next generation is assumed to be predominantly horizontal.

    Synthesis. Overall, our study suggests that oscillation between generalization and ancestral partner loss may maintain high specialization within the lichen genus Peltigera, and that partner selection is not only driven by partners’ geographical overlap, but also by their phylogenetically conserved traits.

     