NSF PAR Search | NSF Public Access Repository

Twister2 Cross‐platform resource scheduler for big data

https://doi.org/10.1002/cpe.6502

Uyar, Ahmet; Gunduz, Gurhan; Kamburugamuve, Supun; Wickramasinghe, Pulasthi; Widanage, Chathura; Govindarajan, Kannan; Perera, Niranda; Abeykoon, Vibhatha; Akkas, Selahattin; Fox, Geoffrey (July 2021, Concurrency and Computation: Practice and Experience)

Abstract Twister2 is an open‐source big data hosting environment designed to process both batch and streaming data at scale. Twister2 runs jobs in both high‐performance computing (HPC) and big data clusters. It provides a cross‐platform resource scheduler to run jobs in diverse environments. Twister2 is designed with a layered architecture to support various clusters and big data problems. In this paper, we present the cross‐platform resource scheduler of Twister2. We identify required services and explain implementation details. We present job startup delays for single jobs and multiple concurrent jobs in Kubernetes and OpenMPI clusters. We compare job startup delays for Twister2 and Spark at a Kubernetes cluster. In addition, we compare the performance of terasort algorithm on Kubernetes and bare metal clusters at AWS cloud.

Search for: All records