Parallel Framework for Data-Intensive Computing with XSEDE

Subramanian, Ranjini; Zhang, Hui

doi:10.1145/3332186.3338097

With the increase in data-driven analytics, the demand for high-performance computing resources has risen. Many high-performance computing centers provide cyberinfrastructure (CI) for academic research. However, access barriers keep these resources from reaching a broad range of users. Users who are new to the data analytics field are not yet equipped to take advantage of the tools offered by CI. In this paper, we propose a framework to lower these access barriers for users who do not have the training to utilize the capabilities of CI. The framework uses the divide-and-conquer (DC) paradigm for data-intensive computing tasks. It consists of three major components: a user interface (UI), a parallel scripts generator (PSG), and the underlying cyberinfrastructure (CI). The goal of the framework is to provide a user-friendly method for parallelizing data-intensive computing tasks with minimal user intervention. Key design goals are usability, scalability, and reproducibility. Users can focus on their problem and leave the parallelization details to the framework.
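The divide-and-conquer pattern named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' framework or its parallel scripts generator; the function names (`split`, `process_chunk`, `combine`, `run_divide_and_conquer`) and the sum-of-squares task are hypothetical, and a local thread pool stands in for the cluster resources a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Divide: cut the input into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_chunk(chunk):
    """Conquer: hypothetical per-chunk task (here, a sum of squares)."""
    return sum(x * x for x in chunk)


def combine(partials):
    """Combine: merge the partial results into one answer."""
    return sum(partials)


def run_divide_and_conquer(data, n_workers=4):
    """Split the data, process the chunks concurrently, merge the results."""
    chunks = split(data, n_workers)
    # Assumption: a thread pool is used only to keep the sketch portable;
    # it is a stand-in for whatever job scheduler the CI provides.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return combine(partials)


if __name__ == "__main__":
    print(run_divide_and_conquer(list(range(10)), n_workers=3))
```

In this sketch the user supplies only the per-chunk task, while splitting, dispatch, and merging are handled by the wrapper, which mirrors the abstract's stated goal of leaving the parallelization details to the framework.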

Award ID(s): 1726532

NSF-PAR ID: 10107981

Author(s) / Creator(s): Subramanian, Ranjini; Zhang, Hui

Date Published: 2019-07-20

Journal Name: PEARC '19 Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines (learning)

Page Range / eLocation ID: 1 to 8

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1145/3332186.3338097
