NSF PAR Search | NSF Public Access Repository


The AmpereOne A192-32X in Perspective: Benchmarking a New Standard

https://doi.org/10.1145/3703001.3724384

Carlson, David; Simakov, Nikolay; Ristow Hadlich, Rodrigo; Curtis, Anthony; Martin, Joshua; Verma, Gaurav; Chheda, Smeet; Coskun, Firat; Gonzalez, Raul; Wood, Daniel; et al. (February 2025, ACM)

This study presents a comprehensive benchmarking analysis of the Arm-based AmpereOne A192-32X CPU, a high-performance but low-power processor designed for cloud-native workloads characterized by high core occupancy, imperfectly vectorized or even pure scalar software, limited need for high floating-point performance, and, increasingly, AI inference. These traits also characterize much of academic research computing. Hence, a thorough investigation of this novel CPU that characterizes its strengths and weaknesses on academic workloads, including traditional HPC codes for which it was not designed, will shed light on its relevance in a research setting. We report comparative analyses with contemporary CPUs (Intel Sapphire Rapids, AMD EPYC, NVIDIA Grace-Grace) and illustrate AmpereOne’s architectural advantages in handling parallel workloads and optimizing power consumption. The CPUs are compared in terms of performance and power consumption using a wide range of applications covering different workloads and disciplines.
Free, publicly-accessible full text available February 19, 2026
Transcriptomic resources for Bagrada hilaris (Burmeister), a widespread invasive pest of Brassicales

https://doi.org/10.1371/journal.pone.0310186

Sparks, Michael E; Nelson, David R; Harrison, Robert L; Larson, Nicholas R; Kuhar, Daniel; Haber, Ariela I; Heraghty, Sam D; Rebholz, Zarley; Tholl, Dorothea; Grettenberger, Ian M; et al. (December 2024, PLOS ONE)
Lou, Yonggen (Ed.)
The bagrada bug, Bagrada hilaris (Burmeister), is an emerging agricultural pest in the Americas, threatening agricultural production in the southwestern United States, Mexico and Chile, as well as in the Old World (including Africa, South Asia and, more recently, Mediterranean areas of Europe). Substantive transcriptomic sequence resources for this damaging species would be beneficial towards understanding its capacity for developing insecticide resistance, identifying viruses that may be present throughout its population and identifying genes differentially expressed across life stages that could be exploited for biomolecular pesticide formulations. This study establishes B. hilaris transcriptomic resources for eggs, 2nd and 4th larval instars, as well as male and female adults. Three gene families involved in xenobiotic detoxification (glutathione S-transferases, carboxylesterases and cytochrome P450 monooxygenases) were phylogenetically characterized. These data were also qualitatively compared with previously published results for two closely related pentatomid species, namely the brown marmorated stink bug, Halyomorpha halys (Stål), and the harlequin bug, Murgantia histrionica (Hahn), to elucidate shared enzymatic components of terpene-based sex pheromone biosynthetic pathways. Lastly, the sequence data were screened for potential RNAi- and virus-related content and for genes implicated in insect growth and development.
Free, publicly-accessible full text available December 27, 2025
Benchmarking with Supernovae: A Performance Study of the FLASH Code

https://doi.org/10.1145/3626203.3670536

Martin, Joshua Ezekiel; Feldman, Catherine; Calder, Alan; Curtis, Tony; Siegmann, Eva; Carlson, David; Gonzalez, Raul; Wood, Daniel; Harrison, Robert; Coskun, Firat (July 2024, ACM)

Full Text Available
First Impressions of the Sapphire Rapids Processor with HBM for Scientific Workloads

https://doi.org/10.1007/s42979-024-02958-3

Siegmann, Eva; Harrison, Robert J; Carlson, David; Chheda, Smeet; Curtis, Anthony; Coskun, Firat; Gonzalez, Raul; Wood, Daniel; Simakov, Nikolay A (June 2024, SN Computer Science)

The landscape of high performance computing (HPC) has witnessed exponential growth in processor diversity, architectural complexity, and performance scalability. With an ever-increasing demand for faster and more efficient computing solutions to address an array of scientific, engineering, and societal challenges, the selection of processors for specific applications becomes paramount. Achieving optimal performance requires a deep understanding of how diverse processors interact with diverse workloads, making benchmarking a fundamental practice in the field of HPC. Here, we present preliminary results observed over such benchmarks and applications and a comparison of Intel Sapphire Rapids and Skylake-X, AMD Milan, and Fujitsu A64FX processors in terms of runtime performance, memory bandwidth utilization, and energy consumption. The examples focus specifically on the Sapphire Rapids processor with and without high-bandwidth memory (HBM). An additional case study reports the performance gains from using Intel’s Advanced Matrix Extensions (AMX) instructions, and how they along with HBM can be leveraged to accelerate AI workloads. These initial results aim to give a rough comparison of the processors rather than a detailed analysis and should prove timely and relevant for researchers who may be interested in using Sapphire Rapids for their scientific workloads.
Full Text Available
First Impressions of the NVIDIA Grace CPU Superchip and NVIDIA Grace Hopper Superchip for Scientific Workloads

https://doi.org/10.1145/3636480.3637097

Simakov, Nikolay A.; Jones, Matthew D.; Furlani, Thomas R.; Siegmann, Eva; Harrison, Robert J. (January 2024, ACM)

Engineering samples of the NVIDIA Grace CPU Superchip and the NVIDIA Grace Hopper Superchip were tested using different benchmarks and scientific applications. The benchmarks include HPCC and HPCG. The real application-based benchmarks include AI-Benchmark-Alpha (a TensorFlow benchmark), Gromacs, OpenFOAM, and ROMS. The performance was compared to multiple Intel, AMD, and ARM CPUs and several x86 systems with NVIDIA GPUs. A brief energy efficiency estimate was performed based on TDP values. We found that in HPCC benchmark tests, the per-core performance of Grace is similar to or faster than that of AMD Milan cores, and the high core count often allows the NVIDIA Grace CPU Superchip to have per-node performance similar to Intel Sapphire Rapids with High Bandwidth Memory: slower in matrix multiplication (by 17%) and FFT (by 6%), and faster in Linpack (by 9%). In scientific applications, the NVIDIA Grace CPU Superchip is slower by 6% to 18% in Gromacs, faster by 7% in OpenFOAM, and right between the HBM and DDR modes of Intel Sapphire Rapids in ROMS. The combined CPU-GPU performance in Gromacs is significantly faster (by 20% to 117%) than any tested x86-NVIDIA GPU system. Overall, the new NVIDIA Grace Hopper Superchip and NVIDIA Grace CPU Superchip are high-performance and most likely energy-efficient solutions for HPC centers.
From Molecular Dynamics to Oceanography - Ookami Graduate Students Porting and Tuning Science Codes for A64FX

https://doi.org/10.1145/3569951.3593608

Kaushik, Kedarsh; Wang, Yuzhang; Ma, Youwei; Carlson, David; Curtis, Tony; Harrison, Robert; Siegmann, Eva (July 2023, PEARC '23: Practice and Experience in Advanced Research Computing)

Full Text Available
A Further Study of Linux Kernel Hugepages on A64FX with FLASH, an Astrophysical Simulation Code

https://doi.org/10.1145/3569951.3597583

Feldman, Catherine; Chheda, Smeet; Calder, Alan; Siegmann, Eva; Dey, John; Curtis, Tony; Harrison, Robert (July 2023, PEARC '23: Practice and Experience in Advanced Research Computing)

Full Text Available
Are we ready for broader adoption of ARM in the HPC community: Performance and Energy Efficiency Analysis of Benchmarks and Applications Executed on High-End ARM Systems

https://doi.org/10.1145/3581576.3581618

Simakov, Nikolay A.; Deleon, Robert L.; White, Joseph P.; Jones, Matthew D.; Furlani, Thomas R.; Siegmann, Eva; Harrison, Robert J. (February 2023, Proceedings of the HPC Asia 2023 Workshops (HPC Asia '23 Workshops))

Full Text Available
Patterns in Foliar Isotopic Nitrogen, Percent Nitrogen, and Site Index for Managed Forest Systems in the United States

https://doi.org/10.3390/f13101694

Buntrock, Laura; Thomas, Valerie A.; Strahm, Brian D.; Fox, Tom; Harrison, Robert; Himes, Austin; Littke, Kim (October 2022, Forests)

Patterns in foliar nitrogen (N) stable isotope ratios (δ¹⁵N) have been shown to reveal trends in terrestrial N cycles, including the identification of ecosystems where N deficiencies limit forest ecosystem productivity. However, there is a gap in our understanding of within-species variation and species-level response to environmental gradients or forest management. Our objective is to examine the relationship between site index, foliar %N, foliar δ¹⁵N and spectral reflectance for managed Douglas-fir (Pseudotsuga menziesii) and loblolly pine (Pinus taeda) plantations across their geographic ranges in the Pacific Northwest and the southeastern United States, respectively. Foliage was measured at 28 sites for reflectance using a handheld spectroradiometer, and further analyzed for δ¹⁵N and N concentration. Unlike the prior work for grasslands and shrubland species, our results show that foliar δ¹⁵N and foliar %N are not well correlated for these tree species. However, multiple linear regression models suggest a strong predictive ability of spectroscopy data to quantify foliar δ¹⁵N, with some models explaining more than 65% of the variance in the δ¹⁵N. Additionally, moderate to strong explanations of variance were found between site index and foliar δ¹⁵N (R² = 0.49) and reflectance and site index (R² = 0.84) in the Douglas-fir data set. The development of relationships between foliar spectral reflectance, δ¹⁵N and measures of site productivity provides the first step toward mapping canopy δ¹⁵N for these managed forests with remote sensing.
Full Text Available
Generalized Flow-Graph Programming Using Template Task-Graphs: Initial Implementation and Assessment

https://doi.org/10.1109/IPDPS53621.2022.00086

Schuchart, Joseph; Nookala, Poornima; Javanmard, Mohammad Mahdi; Herault, Thomas; Valeev, Edward F.; Bosilca, George; Harrison, Robert J. (May 2022, 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW))

We present and evaluate TTG, a novel programming model and its C++ implementation that, by marrying the ideas of control and data flowgraph programming, supports compact specification and efficient distributed execution of dynamic and irregular applications. Programming interfaces that support task-based execution often only support shared memory parallel environments; a few support distributed memory environments, either by discovering the entire DAG of tasks on all processes, or by introducing explicit communications. The first approach limits scalability, while the second increases the complexity of programming. We demonstrate how TTG can address these issues without sacrificing scalability or programmability by providing higher-level abstractions than conventionally provided by task-centric programming systems, without impeding the ability of these runtimes to manage task creation and execution as well as data and resource management efficiently. TTG supports distributed memory execution over two different task runtimes, PaRSEC and MADNESS. Performance of four paradigmatic applications (in graph analytics, dense and block-sparse linear algebra, and numerical integrodifferential calculus) with various degrees of irregularity implemented in TTG is illustrated on large distributed-memory platforms and compared to the state-of-the-art implementations.
Full Text Available
