Popular big data frameworks, ranging from Hadoop MapReduce to Spark, rely on garbage-collected languages, such as Java and Scala. Big data applications are especially sensitive to the effectiveness of garbage collection (i.e., GC), because they usually process a large volume of data objects that lead to heavy GC overhead. Lacking in-depth understanding of GC performance has impeded performance improvement in big data applications. In this paper, we conduct the first comprehensive evaluation on three popular garbage collectors, i.e., Parallel, CMS, and G1, using four representative Spark applications. By thoroughly investigating the correlation between these big data applications’ memory usage patterns and the collectors’ GC patterns, we obtain many findings about GC inefficiencies. We further propose empirical guidelines for application developers, and insightful optimization strategies for designing big-data-friendly garbage collectors.
more »
« less
Garbology: a study of how Java objects die
In this paper we present a study of how Java programs dispose of objects. Unlike prior work on object demographics and lifetime patterns, our goal is to precisely characterize the actions that cause objects to become unreachable. We use a recently-developed tracing tool, called Elephant Tracks, which can localize object deaths within a specific method and tell us the proximal cause. Our analysis centers around garbage clusters: groups of connected objects that become unreachable at precisely the same time due to a single program action. We classify these clusters using traditional features, such as size, allocation site, and lifetime, and using new ones, such as death site and cause of death. We then use this knowledge about garbage clusters in a new GC algorithm: the Cluster Aware Garbage Collection (CAGC) algorithm. We present results for a set of standard benchmarks including SPECJVM98, SPECjbb, and DaCapo. We identify several patterns that could inform the design of new collectors or tuning of existing systems. Most garbage clusters are small, suggesting that these programs almost always dispose of large structures in a piecemeal fashion. In addition, most clusters die in one of only a dozen or so places in the program. Furthermore, these death sites are much more stable and predictable than object lifetimes. Finally we show that the CAGC algorithm can in certain cases improve GC performance.
more »
« less
- Award ID(s):
- 1717373
- PAR ID:
- 10073830
- Date Published:
- Journal Name:
- Proceedings of the 2017 {ACM} {SIGPLAN} International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, Onward! 2017
- Page Range / eLocation ID:
- 168 to 179
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Garbage collection (GC) support for unmanaged languages can reduce programming burden in reasoning about liveness of dynamic objects. It also avoids temporal memory safety violations and memory leaks.SoundGC for weakly-typed languages such as C/C++, however, remains an unsolved problem. Current value-based GC solutions examine values of memory locations to discover the pointers, and the objects they point to. The approach is inherently unsound in the presence of arbitrary type casts and pointer manipulations, which are legal in C/C++. Such language features are regularly used, especially in low-level systems code. In this paper, we propose Dynamic Pointer Provenance Tracking to realize sound GC. We observe that pointers cannot be created out-of-thin-air, and they must have provenance to at least one valid allocation. Therefore, by tracking pointer provenance from the source (e.g., malloc) through both explicit data-flow and implicit control-flow, our GC has sound and precise information to compute the set of all reachable objects at any program state. We discuss several static analysis optimizations, that can be employed during compilation aided with profiling, to significantly reduce the overhead of dynamic provenance tracking from nearly 8Ă— to 16% for well-behaved programs that adhere to the C standards. Pointer provenance based sound GC invocation is also 13% faster and reclaims 6% more memory on average, compared to an unsound value-based GC.more » « less
-
This article introduces a novel system for deriving upper bounds on the heap-space requirements of functional programs with garbage collection. The space cost model is based on a perfect garbage collector that immediately deallocates memory cells when they become unreachable. Heap-space bounds are derived using type-based automatic amortized resource analysis (AARA), a template-based technique that efficiently reduces bound inference to linear programming. The first technical contribution of the work is a new operational cost semantics that models a perfect garbage collector. The second technical contribution is an extension of AARA to take into account automatic deallocation. A key observation is that deallocation of a perfect collector can be modeled with destructive pattern matching if data structures are used in a linear way. However, the analysis uses destructive pattern matching to accurately model deallocation even if data is shared. The soundness of the extended AARA with respect to the new cost semantics is proven in two parts via an intermediate linear cost semantics. The analysis and the cost semantics have been implemented as an extension to Resource Aware ML (RaML). An experimental evaluation shows that the system is able to derive tight symbolic heap-space bounds for common algorithms. Often the bounds are asymptotic improvements over bounds that RaML derives without taking into account garbage collection.more » « less
-
Chung-Kil Hur (Ed.)Dynamic object-oriented languages, such as Python, Ruby, and Javascript are widely used nowadays. A distinguishing feature of dynamic object-oriented languages is that objects, the fundamental runtime data representation, are highly dynamic, meaning that a single constructor may create objects with different types and objects can evolve freely after their construction. While such dynamism facilitates fast prototyping, it brings many challenges to program understanding. Many type systems have been developed to aid programming understanding, and they adopt various types and techniques to represent and track dynamic objects. However, although many types and techniques have been proposed, it is unclear which one suits real dynamic object usages best. Motivated by this situation, we perform an empirical study on 50 mature Python programs with a focus on object dynamism and object type models. We found that (1) object dynamism is highly prevalent in Python programs, (2) class-based types are not precise to handle dynamic behaviors, as they introduce type errors for 52% of the evaluated polymorphic attributes, (3) typestate-based types, although mostly used in static languages, matches the behaviors of dynamic objects faithfully, and (4) some well-designed but still lightweight techniques for objectbased types, such as argument type separation and recency abstraction can precisely characterize dynamic object behaviors. Those techniques are suitable for building precise but still concise object-based types.more » « less
-
null (Ed.)Resource-disaggregated architectures have risen in popularity for large datacenters. However, prior disaggregation systems are designed for native applications; in addition, all of them require applications to possess excellent locality to be efficiently executed. In contrast, programs written in managed languages are subject to periodic garbage collection (GC), which is a typical graph workload with poor locality. Although most datacenter applications are written in managed languages, current systems are far from delivering acceptable performance for these applications. This paper presents Semeru, a distributed JVM that can dramatically improve the performance of managed cloud applications in a memory-disaggregated environment. Its design possesses three major innovations: (1) a universal Java heap, which provides a unified abstraction of virtual memory across CPU and memory servers and allows any legacy program to run without modifications; (2) a distributed GC, which offloads object tracing to memory servers so that tracing is performed closer to data; and (3) a swap system in the OS kernel that works with the runtime to swap page data efficiently. An evaluation of Semeru on a set of widely-deployed systems shows very promising results.more » « less
An official website of the United States government

