skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM to 12:00 PM ET on Tuesday, March 25 due to maintenance. We apologize for the inconvenience.


Title: MPI.jl: Julia bindings for the Message Passing Interface
MPI.jl is a Julia package for using the Message Passing Interface (MPI), a standardized and widely-supported communication interface for distributed computing, with multiple open source and proprietary implementations. It roughly follows the C MPI interface, with some additional conveniences afforded by the Julia language such as automatic handling of buffer lengths and datatypes.  more » « less
Award ID(s):
1835860
PAR ID:
10378090
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
JuliaCon Proceedings
Volume:
1
Issue:
1
ISSN:
2642-4029
Page Range / eLocation ID:
68
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper details the implementation and usage of software-based performance counters to understand the performance of a particular implementation of the MPI standard, Open MPI. Such counters can expose intrinsic features of the software stack that are not available otherwise in a generic and portable way. The PMPI-interface is useful for instrumenting MPI applications at a user level, however it is insufficient for providing meaningful internal MPI performance details. While the Peruse interface provides more detailed information on state changes within Open MPI, it has not seen widespread adoption. We introduce a simple low-level approach that instruments the Open MPI code at key locations to provide fine-grained MPI performance metrics. We evaluate the overhead associated with adding these counters to Open MPI as well as their use in determining bottlenecks and areas for improvement both in user code and the MPI implementation itself. 
    more » « less
  2. Summary The Exascale Computing Project (ECP) focuses on the development of future exascale‐capable applications. Most ECP applications use the message passing interface (MPI) as their parallel programming model with mini‐apps serving as proxies. This paper explores the explicit usage of MPI in such ECP proxy applications. We empirically analyze 14 proxy applications from the ECP Proxy Apps Suite. We use the MPI profiling interface (PMPI) to collect MPI usage patterns in ECP proxy apps. Our analysis shows that a small subset of features from MPI is commonly used in the proxies of exascale‐capable applications, even when they reference third‐party libraries. This study is intended to provide a better understanding of the use of MPI in current exascale applications. The findings can help focus software investments made for exascale systems in the MPI middleware including optimization, fault‐tolerance, tuning, and hardware‐offload. 
    more » « less
  3. The MPI standard has long included one-sided communication abstractions through the MPI Remote Memory Access (RMA) interface. Unfortunately, the MPI RMA chapter in the 4.0 version of the MPI standard still contains both well-known and lesser known short-comings for both implementations and users, which lead to potentially non-optimal usage patterns. In this paper, we identify a set of issues and propose ways for applications to better express anticipated usage of RMA routines, allowing the MPI implementation to better adapt to the application's needs. In order to increase the flexibility of the RMA interface, we add the capability to duplicate windows, allowing access to the same resources encapsulated by a window using different configurations. In the same vein, we introduce the concept of MPI memory handles, meant to provide life-time guarantees on memory attached to dynamic windows, removing the overhead currently present in using dynamically exposed memory. We will show that our extensions provide improved accumulate latencies, reduced overheads for multi-threaded flushes, and allow for zero overhead dynamic memory window usage. 
    more » « less
  4. null (Ed.)
    The Message Passing Interface (MPI) has been the dominant message passing solution for scientific computing for decades. MPI point-to-point communications are highly efficient mechanisms for process-to- process communication. However, MPI performance is slowed by concurrency protections in the MPI library when processes utilize multiple threads. MPI’s current thread-level interface imposes these overheads throughout the library when thread safety is needed. While much work has been done to reduce multithreading overheads in MPI, a solution is needed that reduces the number of messages exchanged in a threaded environment. Partitioned communication is included in the MPI 4.0 standard as an alternative that addresses the challenges of multithreaded communication in MPI today. Partitioned communication reduces overall message volume by creating a buffer-sharing mechanism between threads such that they can indicate when portions of a communication buffer are available to be sent. Separation of the control and data planes in MPI is enabled by allowing persistent initialization and single occurrence message buffer matching from the indication that the data is ready to be sent. This enables the usage commands (destination, size, etc.) can be set up prior to data buffer readiness with readiness triggered by a simple doorbell/counter later. This approach is useful for future development of MPI operations in environments where traditional networking commands can have performance challenges, like accelerators (GPUs, FPGAs). In this paper,we detail the design and implementation of a layered library (built on top of MPI-3.1) and an integrated Open MPI solution that supports the new, MPI-4.0 partitioned communication feature set. The library will enable applications to use currently released MPI implementations and older legacy libraries to provide partitioned communication support while also enabling further exploration of this new communication model in new applications and use cases. We will compare the designs of the library and native Open MPI support, provide performance results and comparisons between the two approaches, and lessons learned from the implementation of partitioned communication in both library and native forms. We find that the native implementation and library have similar performance with a percentage difference under 0.94% in microbenchmarks and performance within 5% for a partitioned communication enabled proxy application. 
    more » « less
  5. null (Ed.)
    The 2019 ABET computer science criteria requires that all computing students learn parallel and distributed computing (PDC) as undergraduates, and CS2013 recommends at least fifteen hours of PDC in the undergraduate curriculum. Consequently, many educators look for easy ways to integrate PDC into courses at their institutions. This hands-on workshop introduces Message Passing Interface (MPI) basics in C/C++ and Python using clusters of Raspberry Pis. The Message Passing Interface (MPI) is a multi-language, platform independent, industry-standard library for parallel and distributed computing. Raspberry Pis are an inexpensive and engaging hardware platform for studying PDC as early as the first course. Participants will experience how to teach distributed computing essentials with MPI by means of reusable, effective "parallel patterns", including single program multiple data (SPMD) execution, send-receive message passing, the master-worker pattern, parallel loop patterns, and other common patterns, plus longer "exemplar" programs that use MPI to solve significant applied problems. The workshop includes: (i) personal experience with the Raspberry Pi (clusters provided for workshop use); (ii) assembly of Beowulf clusters of Raspberry Pis quickly in the classroom; (iii) self-paced hands-on experimentation with the working MPI programs; and (iv) a discussion of how these may be used to achieve the goals of CS2013 and ABET. No prior experience with MPI, PDC, or the Raspberry Pi is expected. All materials from this workshop will be freely available from CSinParallel.org; participants should bring a laptop to access these materials. 
    more » « less