Title: Send to me first: Priority in synchronous message-passing
Abstract: In this paper, we introduce a tiered-priority scheme for a synchronous message-passing language with support for selective communication and first-class communication protocols. Crucially, our scheme allows higher priority threads to communicate with lower priority threads, providing the ability to express programs that would be rejected by classic priority mechanisms that disallow any (potentially) blocking interactions between threads of differing priorities. We formalize our scheme in a novel semantic framework featuring a collection of actions to represent possible communications. Utilizing our formalism, we prove several important and desirable properties of our priority scheme. We also provide a prototype implementation of our tiered-priority scheme capable of expressing Concurrent ML, built on the MLton SML compiler and runtime. We evaluate the viability of our implementation through three case studies: a prioritized buyer-seller protocol and predictable shutdown mechanisms in the Swerve web server and the eXene windowing toolkit. Our experiments show that priority can be easily added to existing CML programs without degrading performance, and our system exhibits negligible overheads on more modest workloads.
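As a rough illustration of the "send to the highest-priority receiver first" behavior described in the abstract, the sketch below implements a toy synchronous channel in C with pthreads, in which a send rendezvouses with the highest-priority blocked receiver. It is not the paper's tiered-priority scheme or its MLton/CML implementation; the pchan API, the flat integer priorities, and the demo in main are invented for this example.

/* A toy "send to the highest-priority receiver first" channel in C/pthreads.
   Illustrative sketch only; names and priorities are invented here. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_WAITERS 8

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int waiter_prio[MAX_WAITERS];   /* priorities of blocked receivers */
    int nwaiters;
    int value;                      /* value currently being handed off */
    int taker_prio;                 /* priority level allowed to take it */
    int full;                       /* 1 while a handoff is in progress */
} pchan;

static pchan chan = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER };

/* Block until a value is handed to this priority level (synchronous receive). */
static int pchan_recv(pchan *c, int my_prio) {
    pthread_mutex_lock(&c->lock);
    c->waiter_prio[c->nwaiters++] = my_prio;
    pthread_cond_broadcast(&c->cond);            /* a receiver is now available */
    while (!(c->full && c->taker_prio == my_prio))
        pthread_cond_wait(&c->cond, &c->lock);
    int v = c->value;
    c->full = 0;
    for (int i = 0; i < c->nwaiters; i++) {      /* unregister this waiter */
        if (c->waiter_prio[i] == my_prio) {
            c->waiter_prio[i] = c->waiter_prio[--c->nwaiters];
            break;
        }
    }
    pthread_cond_broadcast(&c->cond);            /* tell the sender the handoff is done */
    pthread_mutex_unlock(&c->lock);
    return v;
}

/* Block until a receiver is waiting, then rendezvous with the
   highest-priority one (synchronous send). */
static void pchan_send(pchan *c, int v) {
    pthread_mutex_lock(&c->lock);
    while (c->nwaiters == 0 || c->full)
        pthread_cond_wait(&c->cond, &c->lock);
    int best = c->waiter_prio[0];
    for (int i = 1; i < c->nwaiters; i++)
        if (c->waiter_prio[i] > best) best = c->waiter_prio[i];
    c->value = v;
    c->taker_prio = best;
    c->full = 1;
    pthread_cond_broadcast(&c->cond);
    while (c->full)                              /* synchronous: wait for the taker */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

static void *receiver(void *arg) {
    int prio = (int)(intptr_t)arg;
    int v = pchan_recv(&chan, prio);
    printf("receiver with priority %d got %d\n", prio, v);
    return NULL;
}

int main(void) {
    pthread_t lo, hi;
    pthread_create(&lo, NULL, receiver, (void *)(intptr_t)1);
    pthread_create(&hi, NULL, receiver, (void *)(intptr_t)10);
    sleep(1);               /* demo only: let both receivers block first */
    pchan_send(&chan, 42);  /* rendezvouses with the priority-10 receiver */
    pchan_send(&chan, 7);   /* then with the remaining priority-1 receiver */
    pthread_join(lo, NULL);
    pthread_join(hi, NULL);
    return 0;
}

In the actual system described above, priorities attach to threads in a CML-style language with selective communication and first-class events, rather than to a hand-rolled channel as in this toy.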
Muller, Stefan K.; Singer, Kyle; Keeney, Devyn Terra; Neth, Andrew; Agrawal, Kunal; Lee, I-Ting Angelina; Acar, Umut A. (Proceedings of the ACM on Programming Languages)
Many concurrent programs assign priorities to threads to improve responsiveness. When used in conjunction with synchronization mechanisms such as mutexes and condition variables, however, priorities can lead to priority inversions, in which high-priority threads are delayed by low-priority ones. Priority inversions in the use of mutexes are easily handled using dynamic techniques such as priority inheritance, but priority inversions in the use of condition variables are not well-studied and dynamic techniques are not suitable. In this work, we use a combination of static and dynamic techniques to prevent priority inversion in code that uses mutexes and condition variables. A type system ensures that condition variables are used safely, even while dynamic techniques change thread priorities at runtime to eliminate priority inversions in the use of mutexes. We prove the soundness of our system, using a model of priority inversions based on cost models for parallel programs. To show that the type system is practical to implement, we encode it within the type systems of Rust and C++, and show that the restrictions are not overly burdensome by writing sizeable case studies using these encodings, including porting the Memcached object server to use our C++ implementation.
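For readers unfamiliar with the condition-variable case, the following C/pthreads sketch (not taken from the paper) shows the shape of the problem: a high-priority thread blocks on a condition that only a low-priority thread signals, so anything that starves the low-priority thread, such as a busy medium-priority thread, indirectly delays the high-priority one, and mutex-style priority inheritance does not help because the waiter holds no lock that the producer needs.

/* Illustrative priority inversion through a condition variable (C, pthreads).
   Thread roles and the (not shown) medium-priority spinner are assumptions
   made for this example. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;

/* Conceptually runs at high priority: it must wait for work that only a
   low-priority thread produces. */
static void *high_prio_waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
    printf("high-priority thread proceeds\n");
    return NULL;
}

/* Conceptually runs at low priority: if a medium-priority thread keeps the
   CPU busy, this thread is starved and the high-priority waiter is delayed
   with it. Priority inheritance on the mutex cannot fix this, because the
   waiter is blocked on the condition, not on a lock the producer holds. */
static void *low_prio_producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t hi, lo;
    /* Real-time priorities would be assigned with SCHED_FIFO and
       pthread_attr_setschedparam; omitted so the sketch runs unprivileged. */
    pthread_create(&hi, NULL, high_prio_waiter, NULL);
    pthread_create(&lo, NULL, low_prio_producer, NULL);
    pthread_join(hi, NULL);
    pthread_join(lo, NULL);
    return 0;
}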
Utterback, Robert; Agrawal, Kunal; Lee, I-Ting Angelina; Kulkarni, Milind (Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming)
Record-and-replay systems are useful tools for debugging non-deterministic parallel programs by first recording an execution and then replaying that execution to produce the same access pattern. Existing record-and-replay systems generally target thread-based execution models, and record the behaviors and interleavings of individual threads. Dynamic multithreaded languages and libraries, such as the Cilk family, OpenMP, TBB, etc., do not have a notion of threads. Instead, these languages provide a processor-oblivious model of programming, where programs expose task parallelism using high-level constructs such as spawn/sync without regard to the number of threads/cores available to run the program. Thread-based record-and-replay would violate the processor-oblivious nature of these programs, as it incorporates the number of threads into the recorded information, constraining the replayed execution to the same number of threads.
In this paper, we present a processor-oblivious record-and-replay scheme for such languages, in which record and replay can use different numbers of processors and both are scheduled using work stealing. We provide theoretical guarantees for our record-and-replay scheme: record is optimal for programs with one lock, and replay is near-optimal in all cases. In addition, we implemented this scheme in the Cilk Plus runtime system, and our evaluation indicates that processor-obliviousness does not cause substantial overheads.
Worley, Andrew; Soundararajan, Prema; Schafer, Derek; Bangalore, Purushotham; Grant, Ryan; Dosanjh, Matthew; Skjellum, Anthony; Ghafoor, Sheikh (ICPP Workshops '21: 50th International Conference on Parallel Processing Workshop)
The Message Passing Interface (MPI) has been the dominant message-passing solution for scientific computing for decades. MPI point-to-point communications are highly efficient mechanisms for process-to-process communication. However, MPI performance is slowed by concurrency protections in the MPI library when processes utilize multiple threads. MPI's current thread-level interface imposes these overheads throughout the library whenever thread safety is needed. While much work has been done to reduce multithreading overheads in MPI, a solution is needed that reduces the number of messages exchanged in a threaded environment.
Partitioned communication is included in the MPI 4.0 standard as an alternative that addresses the challenges of multithreaded communication in MPI today. Partitioned communication reduces overall message volume by creating a buffer-sharing mechanism between threads such that they can indicate when portions of a communication buffer are available to be sent. Separation of the control and data planes in MPI is enabled by allowing persistent initialization and single-occurrence message-buffer matching, decoupled from the indication that the data is ready to be sent. This means that the communication setup (destination, size, etc.) can be performed before the data buffer is ready, with readiness later signaled by a simple doorbell/counter. This approach is useful for future development of MPI operations in environments where traditional networking commands can have performance challenges, such as accelerators (GPUs, FPGAs).
In this paper, we detail the design and implementation of a layered library (built on top of MPI 3.1) and an integrated Open MPI solution that support the new MPI 4.0 partitioned communication feature set. The library enables applications to use currently released MPI implementations and older legacy libraries to provide partitioned communication support, while also enabling further exploration of this new communication model in new applications and use cases. We compare the designs of the library and the native Open MPI support, provide performance results and comparisons between the two approaches, and report lessons learned from implementing partitioned communication in both library and native forms. We find that the native implementation and the library have similar performance, with a difference under 0.94% in microbenchmarks and performance within 5% for a proxy application enabled with partitioned communication.
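To make the control-plane/data-plane split concrete, the sketch below shows, in C, roughly how an application might drive the MPI 4.0 partitioned communication calls (MPI_Psend_init, MPI_Pready, MPI_Precv_init). It is a minimal illustration rather than code from the paper's layered library or its Open MPI integration; the partition count, datatype, and single-threaded fill loop are assumptions made for the example, and it expects to be launched with at least two ranks (e.g., mpirun -np 2).

/* Minimal MPI 4.0 partitioned point-to-point example (illustrative only). */
#include <mpi.h>
#include <stdlib.h>

#define PARTITIONS 8
#define COUNT_PER_PARTITION 1024   /* elements in each partition */

int main(int argc, char **argv) {
    int provided, rank;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *buf = malloc(sizeof(double) * PARTITIONS * COUNT_PER_PARTITION);
    MPI_Request req;

    if (rank == 0) {
        /* Control plane: destination, sizes, and tag are set up once. */
        MPI_Psend_init(buf, PARTITIONS, COUNT_PER_PARTITION, MPI_DOUBLE,
                       1, 0, MPI_COMM_WORLD, MPI_INFO_NULL, &req);
        MPI_Start(&req);
        /* Data plane: in threaded code, each thread would fill its own
           partition and ring the doorbell for it; a single loop stands in
           for that here. */
        for (int p = 0; p < PARTITIONS; p++) {
            for (int i = 0; i < COUNT_PER_PARTITION; i++)
                buf[p * COUNT_PER_PARTITION + i] = (double)p;
            MPI_Pready(p, req);            /* partition p is ready to send */
        }
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        MPI_Request_free(&req);
    } else if (rank == 1) {
        MPI_Precv_init(buf, PARTITIONS, COUNT_PER_PARTITION, MPI_DOUBLE,
                       0, 0, MPI_COMM_WORLD, MPI_INFO_NULL, &req);
        MPI_Start(&req);
        MPI_Wait(&req, MPI_STATUS_IGNORE); /* completes once all partitions arrive */
        MPI_Request_free(&req);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}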
Mack, Elizabeth A.; Wrase, Sarah; Dahme, Joanne; Crosby, Susan M.; Davis, Martha; Wright, Melody; Muhammad, Ravonne (Journal of the American Water Resources Association)
Abstract: The ability to pay for water and wastewater services is a growing issue in the developed world. To date, utilities have helped customers grappling with affordability issues using different types of customer assistance programs (CAPs). Income-based billing approaches differ from CAPs in that bills are structured to be affordable for customers at the outset. Recently, the City of Philadelphia implemented an innovative program to work toward resolving the affordability problem in the city using income-based billing. This tiered assistance program, or TAP, structures bills for water, wastewater, and stormwater services according to program enrollees' income. Given the innovative nature of the program, this paper describes the rollout of TAP and assesses the impact of the program on customers and utility revenues. The paper closes with a critical assessment of TAP and considerations for utilities evaluating the implementation of similar programs.
Chuang, Cheng-En, Iraci, Grant, and Ziarek, Lukasz. "Send to me first: Priority in synchronous message-passing." Journal of Functional Programming 32. https://par.nsf.gov/biblio/10396542. doi:10.1017/S0956796822000119.
@article{osti_10396542,
  title   = {Send to me first: Priority in synchronous message-passing},
  author  = {Chuang, Cheng-En and Iraci, Grant and Ziarek, Lukasz},
  journal = {Journal of Functional Programming},
  volume  = {32},
  url     = {https://par.nsf.gov/biblio/10396542},
  doi     = {10.1017/S0956796822000119}
}