Subgraph isomorphism uses a small graph as a pattern to identify, within a larger graph, a set of vertices whose edges match the pattern; it is of growing importance in many application areas. Such problems exhibit the potential for very significant fine-grain parallelism, with individual threads having short lifetimes while touching potentially "distant" memory objects in highly unpredictable and irregular ways. Conventional distributed-memory systems handle this poorly, whereas an alternative that combines cheap multi-threading with threads that can migrate freely through a large memory is a more natural fit. This paper demonstrates the potential of such an architecture by comparing its execution characteristics on a large graph with those of several parallel implementations on modern but conventional architectures. The gains exhibited by the migrating threads are significant.
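To make the matching problem concrete, the following is a minimal sketch of a naive backtracking subgraph-isomorphism search over adjacency matrices. It only illustrates the problem statement above; the graph representation and all names are assumptions for illustration, not the implementation evaluated in the paper.

#include <cstdio>
#include <vector>

using AdjMatrix = std::vector<std::vector<int>>;

// Try to extend a partial mapping of pattern vertices [0, next) into the target.
bool extend(const AdjMatrix& pattern, const AdjMatrix& target,
            std::vector<int>& map, std::vector<bool>& used, size_t next) {
    if (next == pattern.size()) return true;               // every pattern vertex is placed
    for (size_t v = 0; v < target.size(); ++v) {
        if (used[v]) continue;                              // each target vertex used at most once
        bool ok = true;
        for (size_t u = 0; u < next; ++u)                   // every pattern edge to an earlier vertex
            if (pattern[next][u] && !target[v][map[u]]) {   //   must map onto a target edge
                ok = false;
                break;
            }
        if (!ok) continue;
        map[next] = static_cast<int>(v);
        used[v] = true;
        if (extend(pattern, target, map, used, next + 1)) return true;
        used[v] = false;                                    // backtrack and try the next candidate
    }
    return false;
}

int main() {
    AdjMatrix pattern = {{0,1,1}, {1,0,1}, {1,1,0}};                   // pattern: a triangle
    AdjMatrix target  = {{0,1,0,1}, {1,0,1,1}, {0,1,0,1}, {1,1,1,0}};  // target: 4-cycle plus chord 1-3
    std::vector<int> map(pattern.size(), -1);
    std::vector<bool> used(target.size(), false);
    std::printf("%s\n", extend(pattern, target, map, used, 0) ? "match found" : "no match");
    return 0;
}

The exhaustive search above is exponential in the worst case; the short-lived, irregularly scattered probes it performs into the large target graph are exactly the kind of fine-grain, unpredictable memory accesses the migrating-thread architecture is aimed at.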
CUDA is designed specifically for NVIDIA GPUs and is not compatible with non-NVIDIA devices. Enabling CUDA execution on alternative backends could greatly benefit the hardware community by fostering a more diverse software ecosystem. To address the need for portability, our objective is to develop a framework that meets key requirements such as extensive coverage, comprehensive end-to-end support, high performance, and hardware scalability. Existing solutions that translate CUDA source code into other high-level languages, however, fall short of these goals. In contrast to these source-to-source approaches, we present a novel framework, CuPBoP, which treats CUDA as a portable language in its own right. Compared to two commercial source-to-source solutions, CuPBoP offers broader coverage and superior performance for CUDA-to-CPU migration. Additionally, we evaluate the performance of CuPBoP against manually optimized CPU programs, highlighting the differences between CPU programs derived from CUDA and those written and tuned by hand. Furthermore, we demonstrate the hardware scalability of CuPBoP by showcasing its successful migration of CUDA to AMD GPUs. To promote further research in this field, we have released CuPBoP as an open-source resource.
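As a rough illustration of what CUDA-to-CPU migration entails, the sketch below shows one common, generic technique: replacing the GPU's implicit thread grid with explicit loops over block and thread indices on the host. It is not CuPBoP's actual translation pipeline, and all function and type names here are illustrative assumptions.

#include <cstdio>
#include <vector>

struct Dim { unsigned x; };                                  // only the x dimension, for brevity

// A vector-add "kernel" written in CUDA style, but taking its indices as
// explicit parameters instead of CUDA's built-in variables.
void vecAddKernel(const float* a, const float* b, float* c, unsigned n,
                  Dim blockIdx, Dim blockDim, Dim threadIdx) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;      // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

// CPU "launcher": explicit loops over blocks and threads stand in for the GPU
// grid; each iteration plays the role of one GPU thread. A real system would
// parallelize the outer loop across CPU cores.
void launchOnCpu(unsigned gridDim, unsigned blockDim,
                 const float* a, const float* b, float* c, unsigned n) {
    for (unsigned bx = 0; bx < gridDim; ++bx)
        for (unsigned tx = 0; tx < blockDim; ++tx)
            vecAddKernel(a, b, c, n, Dim{bx}, Dim{blockDim}, Dim{tx});
}

int main() {
    const unsigned n = 8;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);
    launchOnCpu(2, 4, a.data(), b.data(), c.data(), n);      // 2 blocks of 4 threads cover 8 elements
    for (float v : c) std::printf("%.1f ", v);               // expect 3.0 for every element
    std::printf("\n");
    return 0;
}

The gap between such mechanically derived loop nests and hand-optimized CPU code (vectorization, cache blocking, thread scheduling) is the performance difference the abstract refers to when comparing CUDA-derived and manually optimized CPU programs.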