Life science data analysis workflow development using the BioExtract Server leveraging the iPlant Collaborative cyberinfrastructure

Lushbough, Carol M.; Gnimpieba, Etienne Z.; Dooley, Rion

doi:10.1002/cpe.3237

Summary

To handle the vast quantities of biological data generated by high‐throughput experimental technologies, the BioExtract Server (bioextract.org) has leveraged iPlant Collaborative (www.iplantcollaborative.org) functionality to help address big data storage and analysis issues in the bioinformatics field. The BioExtract Server is a Web‐based, workflow‐enabling system that offers researchers a flexible environment for analyzing genomic data. It lets researchers save a series of BioExtract Server tasks (e.g., query a data source, save a data extract, and execute an analytic tool) as a workflow, and share their data extracts, analytic tools, and workflows with collaborators. The iPlant Collaborative is a community of researchers, educators, and students working to enrich science through the development of cyberinfrastructure: the physical computing resources, collaborative environment, virtual machine resources, and interoperable analysis software and data services that are essential components of modern biology. The iPlant AGAVE Advanced Programming Interface, developed through the iPlant Collaborative, is a hosted, Software‐as‐a‐Service resource providing access to a collection of high performance computing and cloud resources. Leveraging AGAVE, the BioExtract Server gives researchers easy access to multiple high performance computers and delivers computation and storage as dynamically allocated resources via the Internet. © 2014 The Authors. Concurrency and Computation: Practice and Experience published by John Wiley & Sons Ltd.
