Abstract: We introduce BPMF (backprojection and matched filtering), a complete and fully automated workflow designed for earthquake detection and location and distributed as a Python package. This workflow enables the creation of comprehensive earthquake catalogs with low magnitudes of completeness using little or no prior knowledge of the study region. BPMF uses the seismic wavefield backprojection method to construct an initial earthquake catalog that is then densified with matched filtering. BPMF integrates recent machine learning tools to complement physics-based techniques and improve the detection and location of earthquakes. In particular, BPMF offers a flexible framework in which machine learning detectors and backprojection can be combined, effectively transforming single-station detectors into multistation detectors. The modularity of BPMF lets users control the contribution of machine learning tools within the workflow. The computation-intensive tasks (backprojection and matched filtering) are executed with C and CUDA-C routines wrapped in Python code. This use of low-level, fast programming languages and graphics processing unit acceleration enables BPMF to efficiently handle large datasets. Here, we first summarize the methodology and describe the application programming interface. We then illustrate BPMF's capabilities to characterize microseismicity with a 10-year-long application in the Ridgecrest, California area. Finally, we discuss the workflow's runtime scaling with numerical resources and its versatility across various tectonic environments and different problems.
Asynchronous Execution of Python Code on Task-Based Runtime Systems
Despite advancements in parallel and distributed computing, the complexity of programming High Performance Computing (HPC) resources has deterred many domain experts, especially in machine learning and artificial intelligence (AI), from utilizing the performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenience of programming in low-level languages and the cost of acquiring the skills required for programming at this level. In recent years, Python, with the support of linear algebra libraries like NumPy, has gained popularity despite limitations that prevent such code from running in distributed settings. Here we present a solution that maintains high-level programming abstractions as well as parallel and distributed efficiency. Phylanx is an asynchronous array processing toolkit that transforms Python and NumPy operations into code that can be executed in parallel on HPC resources. It does so by mapping Python and NumPy functions and variables into a dependency tree executed by HPX, a general-purpose, parallel, task-based runtime system written in C++. Phylanx additionally provides introspection and visualization capabilities for debugging and performance analysis. We have tested the foundations of our approach by comparing our implementation of widely used machine learning algorithms to accepted NumPy standards.
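The mapping from Python expressions to a dependency tree described in the abstract can be illustrated with a minimal sketch. It uses only the standard-library `ast` module and is not Phylanx's implementation; the names `dependency_tree` and `variables` and the nested-tuple layout are assumptions made for this illustration.

```python
import ast

# Hypothetical operator names for this sketch; not Phylanx primitives.
_BINOPS = {
    ast.Add: "add",
    ast.Sub: "sub",
    ast.Mult: "mul",
    ast.Div: "div",
    ast.MatMult: "matmul",
}


def _dotted(node):
    """Turn a Name or Attribute chain such as np.dot into the string 'np.dot'."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return _dotted(node.value) + "." + node.attr
    raise NotImplementedError(type(node).__name__)


def _build(node):
    """Recursively convert an AST node into a (kind, *children) tuple."""
    if isinstance(node, ast.Name):
        return ("var", node.id)
    if isinstance(node, ast.Constant):
        return ("const", node.value)
    if isinstance(node, ast.BinOp):
        return (_BINOPS[type(node.op)], _build(node.left), _build(node.right))
    if isinstance(node, ast.Call):
        return ("call:" + _dotted(node.func), *[_build(a) for a in node.args])
    raise NotImplementedError(type(node).__name__)


def dependency_tree(expr):
    """Parse a single Python expression and return its dependency tree."""
    return _build(ast.parse(expr, mode="eval").body)


def variables(tree):
    """Collect the names of all variables (leaves) the tree depends on."""
    kind, *children = tree
    if kind == "var":
        return {children[0]}
    if kind == "const":
        return set()
    names = set()
    for child in children:
        names |= variables(child)
    return names


tree = dependency_tree("np.dot(a, b) + c * 2")
print(tree)
print(sorted(variables(tree)))
```

In this example the `np.dot` call and the multiplication share no inputs, so they are the kind of independent subtrees that a task-based runtime could, in principle, schedule concurrently.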
- Award ID(s): 1737785
- PAR ID: 10109768
- Date Published:
- Journal Name: 2018 IEEE/ACM 4th International Workshop on Extreme Scale Programming Models and Middleware (ESPM2)
- Page Range / eLocation ID: 37 to 45
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Operational ocean forecasting systems (OOFSs) are complex engines that must execute ocean models with high performance to provide timely products and datasets. Significant computational resources are then needed to run high-fidelity models, and, historically, the technological evolution of microprocessors has constrained data-parallel scientific computation. Today, graphics processing units (GPUs) offer a rapidly growing and valuable source of computing power rivaling the traditional CPU-based machines: the exploitation of thousands of threads can significantly accelerate the execution of many models, ranging from traditional HPC workloads of finite difference, finite volume, and finite element modelling through to the training of deep neural networks used in machine learning (ML) and artificial intelligence. Despite the advantages, GPU usage in ocean forecasting is still limited due to the legacy of CPU-based model implementations and the intrinsic complexity of porting core models to GPU architectures. This review explores the potential use of GPU in ocean forecasting and how the computational characteristics of ocean models can influence the suitability of GPU architectures for the execution of the overall value chain: it discusses the current approaches to code (and performance) portability, from CPU to GPU, including tools that perform code transformation, easing the adaptation of Fortran code for GPU execution (like PSyclone), the direct use of OpenACC directives (like ICON-O), the adoption of specific frameworks that facilitate the management of parallel execution across different architectures, and the use of new programming languages and paradigms.
Poole, Steve; Hernandez, Oscar; Baker, Matthew; Curtis, Tony (Ed.) SHMEM-ML is a domain-specific library for distributed array computations and machine learning model training and inference. Like other projects at the intersection of machine learning and HPC (e.g. dask, Arkouda, Legate Numpy), SHMEM-ML aims to leverage the performance of the HPC software stack to accelerate machine learning workflows. However, it differs in a number of ways. First, SHMEM-ML targets the full machine learning workflow, not just model training. It supports a general-purpose ndarray abstraction commonly used in Python machine learning applications, and efficiently distributes transformation and manipulation of this ndarray across the full system. Second, SHMEM-ML uses OpenSHMEM as its underlying communication layer, enabling high-performance networking across hundreds or thousands of distributed processes. While most past work in high-performance machine learning has leveraged HPC message-passing communication models as a way to efficiently exchange model gradient updates, SHMEM-ML's focus on the full machine learning lifecycle means that a more flexible and adaptable communication model is needed to support both fine- and coarse-grained communication. Third, SHMEM-ML works to interoperate with the broader Python machine learning software ecosystem. While some frameworks aim to rebuild that ecosystem from scratch on top of the HPC software stack, SHMEM-ML is built on top of Apache Arrow, an in-memory standard for data formatting and data exchange between libraries. This enables SHMEM-ML to share data with other libraries without creating copies of data. This paper describes the design, implementation, and evaluation of SHMEM-ML, demonstrating a general-purpose system for data transformation and manipulation while achieving up to a 38× speedup in distributed training performance relative to the industry-standard Horovod framework without a regression in model metrics.
Python has become one of the most used and taught languages nowadays. Its expressiveness, cross-compatibility and ease of use have made it popular in areas as diverse as finance, bioinformatics or machine learning. However, Python programs are often significantly slower to execute than an equivalent native C implementation, especially for computation-intensive numerical kernels. This work presents PolyBench/Python, implementing the 30 kernels in PolyBench/C, one of the standard benchmark suites for polyhedral optimization, in Python. In addition to the benchmark kernels, a functional wrapper including mechanisms for performance measurement, testing, and execution configuration has been developed. The framework includes support for different ways to translate C-array codes into Python, offering insight into the tradeoffs of Python lists and NumPy arrays. The benchmark performance is thoroughly evaluated on different Python interpreters, and compared against its PolyBench/C counterpart to highlight the profitability (or lack thereof) of using Python for regular numerical codes.
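The list-versus-NumPy trade-off mentioned in this abstract can be made concrete with a small sketch of a GEMM-style kernel written both ways. This is an illustration only and is not taken from PolyBench/Python; the function names `gemm_lists` and `gemm_numpy` and the tiny input matrices are assumptions chosen for the example.

```python
import numpy as np


def gemm_lists(alpha, beta, A, B, C):
    """Compute alpha * A @ B + beta * C on nested Python lists (C-style loops)."""
    n, m, p = len(A), len(B), len(B[0])
    # Start from the scaled copy of C so the inputs are left untouched.
    out = [[beta * C[i][j] for j in range(p)] for i in range(n)]
    for i in range(n):
        for k in range(m):
            a_ik = alpha * A[i][k]
            for j in range(p):
                out[i][j] += a_ik * B[k][j]
    return out


def gemm_numpy(alpha, beta, A, B, C):
    """Compute the same kernel with NumPy arrays and vectorized operations."""
    return alpha * (A @ B) + beta * C


# Small illustrative inputs (hypothetical, not benchmark data).
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 0.0], [0.0, 1.0]]

result_lists = gemm_lists(1.5, 2.0, A, B, C)
result_numpy = gemm_numpy(1.5, 2.0, np.array(A), np.array(B), np.array(C))
print(result_lists)
print(result_numpy)
```

The list version mirrors a direct translation of C array code, while the NumPy version delegates the loops to compiled routines; timing both on larger inputs (for example with the standard `timeit` module) is one way to explore the trade-off the abstract refers to.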
We describe JetLag, a Python-based environment that provides access to a distributed, interactive, asynchronous many-task (AMT) computing framework called Phylanx. This environment encompasses the entire computing process, from a Jupyter front-end for managing code and results to the collection and visualization of performance data. We use a Python decorator to access the abstract syntax tree of Python functions and transpile them into a set of C++ data structures which are then executed by the HPX runtime. The environment includes services for sending functions and their arguments to run as jobs on remote resources. A set of Docker and Singularity containers are used to simplify the setup of the JetLag environment. The JetLag system is suitable for a variety of array computational tasks, including machine learning and exploratory data analysis.

