NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

RASMA: a reverse search algorithm for mining maximal frequent subgraphs

https://doi.org/10.1186/s13040-021-00250-1

Salem, Saeed; Alokshiya, Mohammed; Hasan, Mohammad Al (December 2021, BioData Mining)

Abstract Background Given a collection of coexpression networks over a set of genes, identifying subnetworks that appear frequently is an important research problem known as mining frequent subgraphs. Maximal frequent subgraphs are a representative set of frequent subgraphs; A frequent subgraph is maximal if it does not have a super-graph that is frequent. In the bioinformatics discipline, methodologies for mining frequent and/or maximal frequent subgraphs can be used to discover interesting network motifs that elucidate complex interactions among genes, reflected through the edges of the frequent subnetworks. Further study of frequent coexpression subnetworks enhances the discovery of biological modules and biological signatures for gene expression and disease classification. Results We propose a reverse search algorithm, called RASMA, for mining frequent and maximal frequent subgraphs in a given collection of graphs. A key innovation in RASMA is a connected subgraph enumerator that uses a reverse-search strategy to enumerate connected subgraphs of an undirected graph. Using this enumeration strategy, RASMA obtains all maximal frequent subgraphs very efficiently. To overcome the computationally prohibitive task of enumerating all frequent subgraphs while mining for the maximal frequent subgraphs, RASMA employs several pruning strategies that substantially improve its overall runtime performance. Experimental results show that on large gene coexpression networks, the proposed algorithm efficiently mines biologically relevant maximal frequent subgraphs. Conclusion Extracting recurrent gene coexpression subnetworks from multiple gene expression experiments enables the discovery of functional modules and subnetwork biomarkers. We have proposed a reverse search algorithm for mining maximal frequent subnetworks. Enrichment analysis of the extracted maximal frequent subnetworks reveals that subnetworks that are frequent are highly enriched with known biological ontologies.
more » « less
Full Text Available
Efficiently mining rich subgraphs from vertex-attributed graphs

https://doi.org/10.1145/3388440.3412423

Hakim, Riyad; Salem, Saeed (September 2020, BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)
null (Ed.)
Full Text Available
Post-Processing Summarization for Mining Frequent Dense Subnetworks

https://doi.org/10.1145/3388440.3415989

Seo, San Ha; Salem, Saeed (September 2020, BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)
null (Ed.)
Full Text Available
Mining representative approximate frequent coexpression subnetworks

https://doi.org/10.1145/3388440.3415584

Seo, San Ha; Salem, Saeed (September 2020, BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)
null (Ed.)
Full Text Available
Mining approximate frequent dense modules from multiple gene expression datasets

https://doi.org/10.29007/d87q

Seo, San Ha; Salem, Saeed (January 2020, EPiC Series in Computing)

Large amount of gene expression data has been collected for various environmental and biological conditions. Extracting co-expression networks that are recurrent in multiple co-expression networks has been shown promising in functional gene annotation and biomarkers discovery. Frequent subgraph mining reports a large number of subnetworks. In this work, we propose to mine approximate dense frequent subgraphs. Our proposed approach reports representative frequent subgraphs that are also dense. Our experiments on real gene coexpression networks show that frequent subgraphs are biologically interesting as evidenced by the large percentage of biologically enriched frequent dense subgraphs.
more » « less
Full Text Available

Search for: All records