NSF PAR Search | NSF Public Access Repository

Compiling Loop-Based Nested Parallelism for Irregular Workloads

https://doi.org/10.1145/3620665.3640405

Su, Yian; Rainey, Mike; Wanninger, Nick; Dhiantravan, Nadharm; Liang, Jasper; Acar, Umut A; Dinda, Peter; Campanoni, Simone (April 2024, ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Modern programming languages offer special syntax and semantics for logical fork-join parallelism in the form of parallel loops, allowing them to be nested, e.g., a parallel loop within another parallel loop. This expressiveness comes at a price, however: on modern multicore systems, realizing logical parallelism results in overheads due to the creation and management of parallel tasks, which can wipe out the benefits of parallelism. Today, we expect application programmers to cope with these overheads by manually tuning and optimizing their code. Such tuning requires programmers to reason about architectural factors hidden behind layers of software abstractions, such as task scheduling and load balancing. Managing these factors is particularly challenging when workloads are irregular because their performance is input-sensitive. This paper presents HBC, the first compiler that translates C/C++ programs with high-level, fork-join constructs (e.g., OpenMP) to binaries capable of automatically controlling the cost of parallelism and dealing with irregular, input-sensitive workloads. The basis of our approach is Heartbeat Scheduling, a recent proposal for automatic granularity control, which is backed by formal guarantees on performance. HBC binaries outperform OpenMP binaries for workloads for which even entirely manual solutions struggle to find the right balance between parallelism and its costs.
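To make the phrase "a parallel loop within another parallel loop" concrete, here is a minimal sketch of a nested parallel loop over an irregular workload, written with standard OpenMP pragmas. It is an illustration only and is not taken from the paper: the `CsrMatrix` type and the `spmv` function are hypothetical names, and sparse matrix-vector multiplication is assumed as the workload because the amount of work per row depends on the input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compressed sparse row (CSR) matrix, used only for illustration.
struct CsrMatrix {
    std::vector<std::size_t> row_start; // rows + 1 offsets into col/val
    std::vector<std::size_t> col;       // column index of each nonzero
    std::vector<double> val;            // value of each nonzero
};

// Computes y = A * x with an outer parallel loop over rows and an inner
// parallel loop over the nonzeros of each row.
std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    if (a.row_start.empty()) {
        return {};
    }
    assert(a.col.size() == a.val.size());
    const long rows = static_cast<long>(a.row_start.size()) - 1;
    std::vector<double> y(static_cast<std::size_t>(rows), 0.0);

    #pragma omp parallel for
    for (long i = 0; i < rows; ++i) {
        const long begin = static_cast<long>(a.row_start[static_cast<std::size_t>(i)]);
        const long end = static_cast<long>(a.row_start[static_cast<std::size_t>(i) + 1]);
        double sum = 0.0;
        // Row lengths vary with the input, so the right task granularity
        // for this inner loop is not known ahead of time.
        #pragma omp parallel for reduction(+ : sum)
        for (long k = begin; k < end; ++k) {
            const std::size_t idx = static_cast<std::size_t>(k);
            sum += a.val[idx] * x[a.col[idx]];
        }
        y[static_cast<std::size_t>(i)] = sum;
    }
    return y;
}
```

When the code is compiled without OpenMP support, the pragmas are ignored and both loops run sequentially. With OpenMP enabled, whether the inner loop actually runs in parallel depends on the runtime's nested-parallelism settings, which is one example of the manual tuning the abstract describes.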
Full Text Available
Disentanglement with Futures, State, and Interaction

https://doi.org/10.1145/3632895

Arora, Jatin; Muller, Stefan K; Acar, Umut A (January 2024, Proceedings of the ACM on Programming Languages)

Recent work has proposed a memory property for parallel programs, called disentanglement, and has shown both that it is pervasive in programs written in languages ranging from C/C++ to Parallel ML and that it can be exploited to improve the performance of parallel functional programs. All existing work on disentanglement, however, considers the fork/join model for parallelism and does not apply to futures, the more powerful approach to parallelism. This is not surprising: fork/join parallel programs exhibit a reasonably strict dependency structure (e.g., series-parallel DAGs), which disentanglement exploits. In contrast, with futures, parallel computations become first-class values of the language, and thus can be created, passed between function calls, or stored in memory, just like other ordinary values, resulting in complex dependency structures, especially in the presence of mutable state. For example, parallel programs with futures can have deadlocks, which are impossible with fork-join parallelism. In this paper, we are interested in the theoretical question of whether disentanglement may be extended beyond fork/join parallelism, and specifically to futures. We consider a functional language with futures, Input/Output (I/O), and mutable state (references) and show that a broad range of programs written in this language are disentangled. We start by formalizing disentanglement for futures and proving that purely functional programs written in this language are disentangled. We then generalize this result in three directions. First, we consider state (effects) and prove that stateful programs are disentangled if they are race free. Second, we show that race freedom is a sufficient but not a necessary condition: non-deterministic programs, e.g., those that use atomic read-modify-write operations and some non-deterministic combinators, may also be disentangled. Third, we prove that disentangled task-parallel programs written with futures are free of deadlocks, which arise due to interactions between state and the rich dependencies that can be expressed with futures. Taken together, these results show that disentanglement generalizes to parallel programs with futures and, thus, the benefits of disentanglement may go well beyond fork-join parallelism.
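The idea that futures make parallel computations first-class values can be sketched as follows. This is an illustration only: the paper works with a functional language, whereas this sketch uses the C++ standard library's `std::future` and `std::async` as a stand-in, and the function names `sum_async` and `total` are hypothetical.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Starts summing `data` on another thread and returns the pending result
// as an ordinary value that the caller may store or pass along.
std::future<long> sum_async(std::vector<long> data) {
    return std::async(std::launch::async, [d = std::move(data)] {
        return std::accumulate(d.begin(), d.end(), 0L);
    });
}

// Receives futures that were created elsewhere and stored in a container,
// then waits for each one. The waiting order is decided by this function,
// not by the code that created the futures.
long total(std::vector<std::future<long>>& pending) {
    long result = 0;
    for (auto& f : pending) {
        assert(f.valid()); // each future may be consumed only once
        result += f.get();
    }
    return result;
}
```

Because the futures are created in one place and forced in another, the dependency structure is no longer tied to the program's syntactic nesting, unlike a fork-join region where every child is joined by the code that forked it.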
Full Text Available
Efficient Parallel Self-Adjusting Computation

https://doi.org/10.1145/3409964.3461799

Anderson, Daniel; Blelloch, Guy E.; Baweja, Anubhav; Acar, Umut A. (July 2021, Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA))

Full Text Available
Task parallel assembly language for uncompromising parallelism

https://doi.org/10.1145/3453483.3460969

Rainey, Mike; Newton, Ryan R.; Hale, Kyle; Hardavellas, Nikos; Campanoni, Simone; Dinda, Peter; Acar, Umut A. (June 2021, PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation)
Full Text Available
Provably space-efficient parallel functional programming

https://doi.org/10.1145/3434299

Arora, Jatin; Westrick, Sam; Acar, Umut A. (January 2021, Proceedings of the ACM on Programming Languages)
Because of its many desirable properties, such as its ability to control effects and thus potentially disastrous race conditions, functional programming offers a viable approach to programming modern multicore computers. Over the past decade several parallel functional languages, typically based on dialects of ML and Haskell, have been developed. These languages, however, have traditionally underperformed procedural languages (such as C and Java). The primary reason for this is their hunger for memory, which only grows with parallelism, causing traditional memory management techniques to buckle under increased demand for memory. Recent work opened a new angle of attack on this problem by identifying a memory property of determinacy-race-free parallel programs, called disentanglement, which limits the knowledge of concurrent computations about each other’s memory allocations. The work has shown some promise in delivering good time scalability. In this paper, we present provably space-efficient automatic memory management techniques for determinacy-race-free functional parallel programs, allowing both pure and imperative programs where memory may be destructively updated. We prove that for a program with sequential live memory of R*, any P-processor garbage-collected parallel run requires at most O(R* · P) memory. We also prove a work bound of O(W + R* · P) for P-processor executions, accounting also for the cost of garbage collection. To achieve these results, we integrate thread scheduling with memory management. The idea is to coordinate memory allocation and garbage collection with thread scheduling decisions so that each processor can allocate memory without synchronization and independently collect a portion of memory by consulting a collection policy, which we formulate. The collection policy is fully distributed and does not require communicating with other processors. We show that the approach is practical by implementing it as an extension to the MPL compiler for Parallel ML. Our experimental results confirm our theoretical bounds and show that the techniques perform and scale well.
Full Text Available
Program equivalence for assisted grading of functional programs

https://doi.org/10.1145/3428239

Clune, Joshua; Ramamurthy, Vijay; Martins, Ruben; Acar, Umut A. (November 2020, Proceedings of the ACM on Programming Languages)
Full Text Available
Parallel Batch-Dynamic Trees via Change Propagation

https://doi.org/10.4230/LIPIcs.ESA.2020.2

Acar, Umut; Anderson, Daniel (September 2020, European Symposium on Algorithms (ESA))

Full Text Available
Priority Scheduling for Interactive Applications

https://doi.org/10.1145/3350755.3400236

Singer, Kyle; Goldstein, Noah; Muller, Stefan K.; Agrawal, Kunal; Lee, I-Ting Angelina; Acar, Umut A. (July 2020, Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures)

Full Text Available
Responsive parallelism with futures and state

https://doi.org/10.1145/3385412.3386013

Muller, Stefan K.; Singer, Kyle; Goldstein, Noah; Acar, Umut A.; Agrawal, Kunal; Lee, I-Ting Angelina (June 2020, Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation)

Full Text Available
Disentanglement in nested-parallel programs

https://doi.org/10.1145/3371115

Westrick, S; Yadav, R; Fluet, M; Acar, U (January 2020, ACM)

Full Text Available
