Title: PhyloWGA: chromosome-aware phylogenetic interrogation of whole genome alignments
Abstract Summary Here, we present PhyloWGA, an open source R package for conducting phylogenetic analysis and investigation of whole genome data. Availability and implementation Available at GitHub (https://github.com/radamsRHA/PhyloWGA). Supplementary information Supplementary data are available at Bioinformatics online.
Abstract Summary We present StochSS Live!, a web-based service for modeling, simulation and analysis of a wide range of mathematical, biological and biochemical systems. Using an epidemiological model of COVID-19, we demonstrate the power of StochSS Live! to enable researchers to quickly develop a deterministic or a discrete stochastic model, infer its parameters and analyze the results. Availability and implementation StochSS Live! is freely available at https://live.stochss.org/ Supplementary information Supplementary data are available at Bioinformatics online.
Abstract Summary ProDy, an integrated application programming interface developed for modelling and analysing protein dynamics, has significantly evolved in recent years in response to the growing data and needs of the computational biology community. We present major developments that led to ProDy 2.0: (i) improved interfacing with databases and parsing new file formats, (ii) SignDy for signature dynamics of protein families, (iii) CryoDy for collective dynamics of supramolecular systems using cryo-EM density maps and (iv) essential site scanning analysis for identifying sites essential to modulating global dynamics. Availability and implementation ProDy is open-source and freely available under MIT License from https://github.com/prody/ProDy. Supplementary information Supplementary data are available at Bioinformatics online.
Abstract Motivation Accurately representing biological networks in a low-dimensional space, also known as network embedding, is a critical step in network-based machine learning and is carried out widely using node2vec, an unsupervised method based on biased random walks. However, while many networks, including functional gene interaction networks, are dense, weighted graphs, node2vec is fundamentally limited in its ability to use edge weights during the biased random walk generation process, thus under-using all the information in the network. Results Here, we present node2vec+, a natural extension of node2vec that accounts for edge weights when calculating walk biases and reduces to node2vec in the cases of unweighted graphs or unbiased walks. Using two synthetic datasets, we empirically show that node2vec+ is more robust to additive noise than node2vec in weighted graphs. Then, using genome-scale functional gene networks to solve a wide range of gene function and disease prediction tasks, we demonstrate the superior performance of node2vec+ over node2vec in the case of weighted graphs. Notably, due to the limited amount of training data in the gene classification tasks, graph neural networks such as GCN and GraphSAGE are outperformed by both node2vec and node2vec+. Availability and implementation The data and code are available on GitHub at https://github.com/krishnanlab/node2vecplus_benchmarks. All additional data underlying this article are available on Zenodo at https://doi.org/10.5281/zenodo.7007164. Supplementary information Supplementary data are available at Bioinformatics online.
Abstract Summary Bioinformatics applications increasingly rely on ad hoc disk storage of k-mer sets, e.g. for de Bruijn graphs or alignment indexes. Here, we introduce the K-mer File Format as a general lossless framework for storing and manipulating k-mer sets, realizing space savings of 3–5× compared to other formats, and bringing interoperability across tools. Availability and implementation Format specification, C++/Rust API, tools: https://github.com/Kmer-File-Format/. Supplementary information Supplementary data are available at Bioinformatics online.
Marco-Sola, Santiago; Eizenga, Jordan M; Guarracino, Andrea; Paten, Benedict; Garrison, Erik; Moreto, Miquel
(Bioinformatics)
Martelli, Pier Luigi
(Ed.)
Abstract Motivation Pairwise sequence alignment remains a fundamental problem in computational biology and bioinformatics. Recent advances in genomics and sequencing technologies demand faster and scalable algorithms that can cope with the ever-increasing sequence lengths. Classical pairwise alignment algorithms based on dynamic programming are strongly limited by quadratic requirements in time and memory. The recently proposed wavefront alignment algorithm (WFA) introduced an efficient algorithm to perform exact gap-affine alignment in O(ns) time, where s is the optimal score and n is the sequence length. Notwithstanding these bounds, WFA’s O(s^2) memory requirements become computationally impractical for genome-scale alignments, leading to a need for further improvement. Results In this article, we present the bidirectional WFA algorithm, the first gap-affine algorithm capable of computing optimal alignments in O(s) memory while retaining WFA’s time complexity of O(ns). As a result, this work improves the lowest known memory bound O(n) to compute gap-affine alignments. In practice, our implementation never requires more than a few hundred MBs aligning noisy Oxford Nanopore Technologies reads up to 1 Mbp long while maintaining competitive execution times. Availability and implementation All code is publicly available at https://github.com/smarco/BiWFA-paper. Supplementary information Supplementary data are available at Bioinformatics online.
Adams, Richard H, Castoe, Todd A, and DeGiorgio, Michael. PhyloWGA: chromosome-aware phylogenetic interrogation of whole genome alignments. Retrieved from https://par.nsf.gov/biblio/10213825. Bioinformatics. Web. doi:10.1093/bioinformatics/btaa884.
@article{osti_10213825,
place = {Country unknown/Code not available},
title = {PhyloWGA: chromosome-aware phylogenetic interrogation of whole genome alignments},
url = {https://par.nsf.gov/biblio/10213825},
DOI = {10.1093/bioinformatics/btaa884},
abstractNote = {Abstract Summary Here, we present PhyloWGA, an open source R package for conducting phylogenetic analysis and investigation of whole genome data. Availability and implementation Available at GitHub (https://github.com/radamsRHA/PhyloWGA). Supplementary information Supplementary data are available at Bioinformatics online.},
journal = {Bioinformatics},
author = {Adams, Richard H and Castoe, Todd A and DeGiorgio, Michael},
editor = {Ponty, Yann}
}