Title: Association study based on topological constraints of protein–protein interaction networks
Abstract

The non-random interaction pattern of a protein–protein interaction network (PIN) is biologically informative, but its potential has not been fully exploited in omics studies. Here, we propose a network-permutation-based association study (NetPAS) method that gauges the observed interactions between two sets of genes by comparing the empirical network with permutation null models. This enables NetPAS to evaluate relationships, constrained by network topology, between gene sets related to different phenotypes. We demonstrated the utility of NetPAS on 50 well-curated gene sets and compared association results obtained with Z-scores, modified Zʹ-scores, p-values and Jaccard indices. Using NetPAS, a weighted human disease network was generated from the association scores of 19 gene sets from OMIM. We also applied NetPAS to gene sets derived from gene ontology and pathway annotations and showed that NetPAS uncovered functional terms missed by DAVID and WebGestalt. Overall, we show that NetPAS can take the topological constraints of molecular networks into account and offer perspectives not available from existing methods.
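
The general idea of scoring the interactions between two gene sets against a permutation null can be illustrated with a minimal Python sketch. This is not the NetPAS implementation: the abstract does not specify the null model, so the sketch assumes a simple node-label permutation (which keeps the network topology fixed but does not preserve each gene's degree), and the function names `cross_edges`, `jaccard` and `permutation_z` are hypothetical.

```python
import random
import statistics


def cross_edges(edges, set_a, set_b):
    """Count edges with one endpoint in set_a and the other in set_b."""
    return sum(
        1
        for u, v in edges
        if (u in set_a and v in set_b) or (u in set_b and v in set_a)
    )


def jaccard(set_a, set_b):
    """Jaccard index: shared genes divided by all genes in either set."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def permutation_z(edges, set_a, set_b, n_perm=1000, seed=0):
    """Compare the observed cross-set edge count with a permutation null.

    Assumption: the null is built by shuffling node labels, which is a
    simplification chosen for illustration, not the paper's procedure.
    Returns (observed count, Z-score, one-sided empirical p-value).
    """
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    observed = cross_edges(edges, set_a, set_b)

    null_counts = []
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted = [(relabel[u], relabel[v]) for u, v in edges]
        null_counts.append(cross_edges(permuted, set_a, set_b))

    mean = statistics.fmean(null_counts)
    sd = statistics.pstdev(null_counts)
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # Add-one correction keeps the empirical p-value strictly positive.
    p = (1 + sum(c >= observed for c in null_counts)) / (n_perm + 1)
    return observed, z, p
```

A large positive Z-score would indicate that the two gene sets interact more often in the empirical network than in the permuted ones, whereas the Jaccard index only measures gene overlap and ignores the network entirely.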

 
Award ID(s):
1761839 1720215
NSF-PAR ID:
10166915
Author(s) / Creator(s):
;
Publisher / Repository:
Nature Publishing Group
Date Published:
Journal Name:
Scientific Reports
Volume:
10
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Martelli, Pier Luigi (Ed.)
    Abstract Motivation Transferring knowledge between species is challenging: different species contain distinct proteomes and cellular architectures, which cause their proteins to carry out different functions via different interaction networks. Many approaches to protein functional annotation use sequence similarity to transfer knowledge between species. These approaches cannot produce accurate predictions for proteins without homologues of known function, as many functions require cellular context for meaningful prediction. To supply this context, network-based methods use protein-protein interaction (PPI) networks as a source of information for inferring protein function and have demonstrated promising results in function prediction. However, most of these methods are tied to a network for a single species, and many species lack biological networks. Results In this work, we integrate sequence and network information across multiple species by computing IsoRank similarity scores to create a meta-network profile of the proteins of multiple species. We use this integrated multispecies meta-network as input to train a maxout neural network with Gene Ontology terms as target labels. Our multispecies approach takes advantage of more training examples, and consequently leads to significant improvements in function prediction performance compared to two network-based methods, a deep learning sequence-based method and the BLAST annotation method used in the Critical Assessment of Functional Annotation. We are able to demonstrate that our approach performs well even in cases where a species has no network information available: when an organism’s PPI network is left out we can use our multispecies method to make predictions for the left-out organism with good performance. Availability and implementation The code is freely available at https://github.com/nowittynamesleft/NetQuilt. The data, including sequences, PPI networks and GO annotations are available at https://string-db.org/. Supplementary information Supplementary data are available at Bioinformatics online.
  2. Abstract

    Convergent research identifies a general factor (“P factor”) that confers transdiagnostic risk for psychopathology. Large-scale networks are key organizational units of the human brain. However, studies of altered network connectivity patterns associated with the P factor are limited, especially in early adolescence when most mental disorders are first emerging. We studied 11,875 9- and 10-year-olds from the Adolescent Brain and Cognitive Development (ABCD) study, of whom 6593 had high-quality resting-state scans. Network contingency analysis was used to identify altered interconnections associated with the P factor among 16 large-scale networks. These connectivity changes were then further characterized with quadrant analysis that quantified the directionality of P factor effects in relation to neurotypical patterns of positive versus negative connectivity across connections. The results showed that the P factor was associated with altered connectivity across 28 network cells (i.e., sets of connections linking pairs of networks); p_PERMUTATION values < 0.05 FDR-corrected for multiple comparisons. Higher P factor scores were associated with hypoconnectivity within default network and hyperconnectivity between default network and multiple control networks. Among connections within these 28 significant cells, the P factor was predominantly associated with “attenuating” effects (67%; p_PERMUTATION < 0.0002), i.e., reduced connectivity at neurotypically positive connections and increased connectivity at neurotypically negative connections. These results demonstrate that the general factor of psychopathology produces attenuating changes across multiple networks including default network, involved in spontaneous responses, and control networks involved in cognitive control. Moreover, they clarify mechanisms of transdiagnostic risk for psychopathology and invite further research into developmental causes of distributed attenuated connectivity.

     
  3. Abstract

    Recent genome-wide studies have begun to identify gene variants, expression profiles, and regulators associated with neuroticism, anxiety disorders, and depression. We conducted a set of experimental cell culture studies of gene regulation by micro RNAs (miRNAs), based on genome-wide transcriptome, proteome, and miRNA expression data from twenty postmortem samples of lateral amygdala from donors with known neuroticism scores. Using Ingenuity Pathway Analysis and TargetScan, we identified a list of mRNA–protein–miRNA sets whose expression patterns were consistent with miRNA-based translational repression, as a function of trait anxiety. Here, we focused on one gene from that list, which is of particular translational significance in Psychiatry: synaptic vesicle glycoprotein 2A (SV2A) is the binding site of the anticonvulsant drug levetiracetam ((S)-α-Ethyl-2-oxo-1-pyrrolidineacetamide), which has shown promise in anxiety disorder treatments. We confirmed that SV2A is associated with neuroticism or anxiety using an original GWAS of a community cohort (N = 1,706), and cross-referencing a published GWAS of multiple cohorts (Ns ranging from 340,569 to 390,278). Postmortem amygdala expression profiling implicated three putative regulatory miRNAs to target SV2A: miR-133a, miR-138, and miR-218. Moving from association to experimental causal testing in cell culture, we used a luciferase assay to demonstrate that miR-133a and miR-218, but not miR-138, significantly decreased relative luciferase activity from the SV2A dual-luciferase construct. In human neuroblastoma cells, transfection with miR-133a and miR-218 reduced both endogenous SV2A mRNA and protein levels, confirming miRNA targeting of the SV2A gene. This study illustrates the utility of combining postmortem gene expression data with GWAS to guide experimental cell culture assays examining gene regulatory mechanisms that may contribute to complex human traits. Identifying specific molecular mechanisms of gene regulation may be useful for future clinical applications in anxiety disorders or other forms of psychopathology.

     
  4. Abstract Motivation

    Protein function prediction, based on the patterns of connection in a protein–protein interaction (or association) network, is perhaps the most studied of the classical, fundamental inference problems for biological networks. A highly successful set of recent approaches use random walk-based low-dimensional embeddings that tend to place functionally similar proteins into coherent spatial regions. However, these approaches lose valuable local graph structure from the network when considering only the embedding. We introduce GLIDER, a method that replaces a protein–protein interaction or association network with a new graph-based similarity network. GLIDER is based on a variant of our previous GLIDE method, which was designed to predict missing links in protein–protein association networks, capturing implicit local and global (i.e. embedding-based) graph properties.

    Results

    GLIDER outperforms competing methods on the task of predicting GO functional labels in cross-validation on a heterogeneous collection of four human protein–protein association networks derived from the 2016 DREAM Disease Module Identification Challenge, and also on three different protein–protein association networks built from the STRING database. We show that this is due to the strong functional enrichment that is present in the local GLIDER neighborhood in multiple different types of protein–protein association networks. Furthermore, we introduce the GLIDER graph neighborhood as a way for biologists to visualize the local neighborhood of a disease gene. As an application, we look at the local GLIDER neighborhoods of a set of known Parkinson’s Disease GWAS genes, rediscover many genes which have known involvement in Parkinson’s disease pathways, plus suggest some new genes to study.

    Availability and implementation

    All code is publicly available and can be accessed here: https://github.com/kap-devkota/GLIDER.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  5. Abstract Background Identification of genes responsible for anatomical entities is a major requirement in many fields including developmental biology, medicine, and agriculture. Current wet lab techniques used for this purpose, such as gene knockout, are resource- and time-intensive. Protein–protein interaction (PPI) networks are frequently used to predict disease genes for humans and gene candidates for molecular functions, but they are rarely used to predict genes for anatomical entities. Moreover, PPI networks suffer from network quality issues, which can limit their use in predicting candidate genes. Therefore, we developed an integrative framework to improve the candidate gene prediction accuracy for anatomical entities by combining existing experimental knowledge about gene-anatomical entity relationships with PPI networks using anatomy ontology annotations. We hypothesized that this integration improves the quality of the PPI networks by reducing the number of false positive and false negative interactions and is better optimized to predict candidate genes for anatomical entities. We used existing Uberon anatomical entity annotations for zebrafish and mouse genes to construct gene networks by calculating semantic similarity between the genes. These anatomy-based gene networks were semantic networks, as they were constructed based on the anatomy ontology annotations that were obtained from the experimental data in the literature. We integrated these anatomy-based gene networks with mouse and zebrafish PPI networks retrieved from the STRING database and compared the performance of their network-based candidate gene predictions. Results According to evaluations of candidate gene prediction performance tested under four different semantic similarity calculation methods (Lin, Resnik, Schlicker, and Wang), the integrated networks, which were semantically improved PPI networks, performed better than PPI networks for both zebrafish and mouse, with higher area-under-the-curve values for receiver operating characteristic and precision-recall curves. Conclusion Integration of existing experimental knowledge about gene-anatomical entity relationships with PPI networks via anatomy ontology improved the candidate gene prediction accuracy and optimized the networks for predicting candidate genes for anatomical entities.