In this paper we present a study of how Java programs dispose of objects. Unlike prior work on object demographics and lifetime patterns, our goal is to precisely characterize the actions that cause objects to become unreachable. We use a recently developed tracing tool, called Elephant Tracks, which can localize object deaths within a specific method and tell us the proximal cause. Our analysis centers on garbage clusters: groups of connected objects that become unreachable at precisely the same time due to a single program action. We classify these clusters using traditional features, such as size, allocation site, and lifetime, and using new ones, such as death site and cause of death. We then apply this knowledge about garbage clusters in a new GC algorithm: the Cluster Aware Garbage Collection (CAGC) algorithm. We present results for a set of standard benchmarks including SPECjvm98, SPECjbb, and DaCapo. We identify several patterns that could inform the design of new collectors or the tuning of existing systems. Most garbage clusters are small, suggesting that these programs almost always dispose of large structures in a piecemeal fashion. In addition, most clusters die in one of only a dozen or so places in the program. Furthermore, these death sites are much more stable and predictable than object lifetimes. Finally, we show that the CAGC algorithm can, in certain cases, improve GC performance.
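The cluster analysis described above can be pictured as a union-find pass over a death trace. Below is a minimal sketch of that idea, assuming a trace of (objectId, deathTime, outgoing references) records; the record format and all names are hypothetical, not Elephant Tracks' actual output.

```java
import java.util.*;

// Hypothetical trace record: an object's id, the trace time at which it
// became unreachable, and the objects it referenced. Not Elephant Tracks'
// real format -- an illustration only.
record DeathRecord(long objectId, long deathTime, List<Long> refs) {}

class ClusterBuilder {
    private final Map<Long, Long> parent = new HashMap<>();  // union-find forest

    private long find(long x) {
        parent.putIfAbsent(x, x);
        long root = parent.get(x);
        if (root != x) {
            root = find(root);
            parent.put(x, root);  // path compression
        }
        return root;
    }

    private void union(long a, long b) { parent.put(find(a), find(b)); }

    /** Groups records into clusters: connected objects with the same death time. */
    Map<Long, List<DeathRecord>> clusters(List<DeathRecord> deaths) {
        Map<Long, DeathRecord> byId = new HashMap<>();
        deaths.forEach(d -> byId.put(d.objectId(), d));
        for (DeathRecord d : deaths) {
            for (long ref : d.refs()) {
                DeathRecord target = byId.get(ref);
                // Only edges between objects dying at the same instant join a cluster.
                if (target != null && target.deathTime() == d.deathTime()) {
                    union(d.objectId(), ref);
                }
            }
        }
        Map<Long, List<DeathRecord>> out = new HashMap<>();
        for (DeathRecord d : deaths) {
            out.computeIfAbsent(find(d.objectId()), k -> new ArrayList<>()).add(d);
        }
        return out;  // key: cluster representative; value: its members
    }
}
```

Features such as cluster size or a shared death site then fall out directly from each group.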
An Experimental Evaluation of Garbage Collectors on Big Data Applications
Popular big data frameworks, ranging from Hadoop MapReduce to Spark, rely on garbage-collected languages such as Java and Scala. Big data applications are especially sensitive to the effectiveness of garbage collection (GC), because they usually process a large volume of data objects that lead to heavy GC overhead. The lack of an in-depth understanding of GC performance has impeded performance improvement in big data applications. In this paper, we conduct the first comprehensive evaluation of three popular garbage collectors, Parallel, CMS, and G1, using four representative Spark applications. By thoroughly investigating the correlation between these big data applications' memory usage patterns and the collectors' GC patterns, we obtain many findings about GC inefficiencies. We further propose empirical guidelines for application developers and insightful optimization strategies for designing big-data-friendly garbage collectors.
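To reproduce this kind of comparison, one can run the same workload once per collector (selected with -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, or -XX:+UseG1GC on the JDK generations studied) and read the JVM's own GC counters. The sketch below uses the standard java.lang.management API; only the choice of printed fields is ours.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Prints each active collector's cumulative collection count and pause time.
// Run the same workload under different -XX:+Use...GC flags to compare.
public class GcReport {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-25s collections=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

For Spark executors specifically, the flags would go in spark.executor.extraJavaOptions.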
- PAR ID: 10159018
- Journal Name: The 45th International Conference on Very Large Data Bases (VLDB'19)
- Sponsoring Org: National Science Foundation
More Like this
- Host-managed shingled magnetic recording (HM-SMR) drives offer a capacity advantage for harnessing the explosive growth of data. Applications where data is written sequentially and read randomly, such as key-value stores based on log-structured merge trees (LSM-trees), make HM-SMR an ideal solution due to its capacity, predictable performance, and economical cost. However, building an LSM-tree-based KV store on HM-SMR drives presents severe challenges in maintaining performance and space efficiency, due to the redundant cleaning processes for applications and storage devices (i.e., compaction and garbage collection). To eliminate the overhead of on-disk garbage collection (GC) and improve compaction efficiency, this paper presents GearDB, a GC-free KV store tailored for HM-SMR drives. GearDB proposes three new techniques: a new on-disk data layout, compaction windows, and a novel gear compaction algorithm. We implement and evaluate GearDB with LevelDB on a real HM-SMR drive. Our extensive experiments show that GearDB achieves both good performance and space efficiency, i.e., on average 1.71× faster than LevelDB in random write with a space efficiency of 89.9%.
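The compaction-window idea can be sketched as bounding each compaction to the tables that share an on-disk zone, so the zone empties wholesale and never needs a separate GC pass. The sketch below is our own simplified illustration; the Table layout and selection policy are hypothetical, not GearDB's actual design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a compaction window on a zoned (SMR) disk:
// compact all tables that live in the victim's zone together, so the
// whole zone becomes free space without any on-disk garbage collection.
class CompactionWindow {
    record Table(long zoneId, long minKey, long maxKey) {}

    /** Selects the victim's zone-mates as the compaction window. */
    static List<Table> pick(List<Table> level, Table victim) {
        List<Table> window = new ArrayList<>();
        for (Table t : level) {
            if (t.zoneId() == victim.zoneId()) {
                window.add(t);
            }
        }
        return window;
    }
}
```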
- Far-memory techniques that enable applications to use remote memory are increasingly appealing in modern data centers, supporting applications' large memory footprints and improving machines' resource utilization. Unfortunately, most far-memory techniques focus on OS-level optimizations and are agnostic to the managed runtimes and garbage collection (GC) underneath applications written in high-level languages. Because GC's object-access patterns differ from the application's, GC can severely interfere with existing far-memory techniques, breaking remote-memory prefetching algorithms and causing severe local-memory misses. We developed MemLiner, a runtime technique that improves the performance of far-memory systems by aligning memory accesses from application and GC threads so that they follow similar memory access paths, thereby (1) reducing the local-memory working set and (2) improving remote-memory prefetching through simplified memory access patterns. We implemented MemLiner in two widely used GCs in OpenJDK: G1 and Shenandoah. Our evaluation with a range of widely deployed cloud systems shows that MemLiner improves applications' end-to-end performance by up to 3.3× and reduces applications' tail latency by up to 220.0×.
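The "lining up" idea can be illustrated with a toy walk: if the collector visits objects along the same path the application is already touching, each remotely fetched page serves both walks, instead of the two walks faulting in disjoint page sets. The sketch is purely conceptual with hypothetical names; MemLiner's real mechanism lives inside the JVM's concurrent tracing loops.

```java
import java.util.List;

// Conceptual sketch: mutator access and GC-style marking follow one shared
// order, shrinking the combined working set. Names are illustrative only.
class LinedUpWalk {
    record Obj(int id, byte[] payload) {}

    static long walk(List<Obj> accessOrder) {
        long checksum = 0;
        for (Obj o : accessOrder) {
            checksum += o.payload()[0];  // application access to the object
            mark(o);                     // collector visit on the same path
        }
        return checksum;
    }

    static void mark(Obj o) { /* record o as live; placeholder for tracing */ }
}
```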
- To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which is both costly and energy-inefficient. Emerging non-volatile memory (NVM) technologies offer high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in big data processing. However, most big data applications are written in managed languages and executed on top of a managed runtime that already performs various dimensions of memory management. Supporting hybrid physical memories adds a new dimension, creating unique challenges in data placement. This article proposes Panthera, a semantics-aware, fully automated memory-management technique for big data processing over hybrid memories. Panthera analyzes user programs on a big data system to infer their coarse-grained access patterns, which are then passed to the Panthera runtime for efficient data placement and migration. For big data applications, the coarse-grained data-division information is accurate enough to guide the GC for data layout, incurring little overhead for data monitoring and movement. We implemented Panthera in OpenJDK and Apache Spark. Based on big data applications' memory access patterns, we also implemented a new profiling-guided optimization strategy, which is transparent to applications. With this optimization, our extensive evaluation demonstrates that Panthera reduces energy by 32–53% at less than 1% time overhead on average. To show Panthera's applicability, we extend it to QuickCached, a pure Java implementation of Memcached. Our evaluation results show that Panthera reduces energy by 28.7% at 5.2% time overhead on average.
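The coarse-grained placement described can be sketched as a simple tier policy over profiled access counts: hot datasets stay in DRAM, cold ones move to NVM. The names, threshold, and profile format below are ours; Panthera's real policy derives placement from static analysis of the Spark program and migrates data during GC.

```java
import java.util.Map;

// Hypothetical tier policy: place a dataset by its profiled access rate.
class PlacementPolicy {
    enum Tier { DRAM, NVM }

    static Tier place(long accessesPerEpoch, long hotThreshold) {
        return accessesPerEpoch >= hotThreshold ? Tier.DRAM : Tier.NVM;
    }

    public static void main(String[] args) {
        // Illustrative profile: dataset name -> accesses per profiling epoch.
        Map<String, Long> profile = Map.of("rdd_lineitems", 1200L, "rdd_history", 3L);
        profile.forEach((name, count) ->
                System.out.println(name + " -> " + place(count, 100)));
    }
}
```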