NSF PAR Search | NSF Public Access Repository

A Probabilistic Approach To Selecting Build Configurations in Package Managers

https://doi.org/10.1109/SC41406.2024.00090

Nichols, Daniel; Menon, Harshitha; Gamblin, Todd; Bhatele, Abhinav (November 2024, Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis)

Full Text Available

xAMM: “Attention” to Details Improves Cross-Platform Prediction Accuracy

https://doi.org/10.1109/CCGRID64434.2025.00067

Dhakal, Aakash Raj; Islam, Tanzima Z; Dey, Arunavo; Nichols, Daniel; Bhatele, Abhinav; Patki, Tapasya; Scogland, Tom; Yeom, Jae-Seung (May 2025, IEEE)

Free, publicly accessible full text available May 19, 2026

Can Large Language Models Write Parallel Code?

https://doi.org/10.1145/3625549.3658689

Nichols, Daniel; Davis, Joshua H; Xie, Zhaojun; Rajaram, Arjun; Bhatele, Abhinav (June 2024, Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing. ACM.)

Full Text Available

HPC-Coder: Modeling Parallel Programs using Large Language Models

https://doi.org/10.23919/ISC.2024.10528929

Nichols, Daniel; Marathe, Aniruddha; Menon, Harshitha; Gamblin, Todd; Bhatele, Abhinav (May 2024, Proceedings of the ISC High Performance Conference. IEEE.)

Full Text Available

Predicting Cross-Architecture Performance of Parallel Programs

https://doi.org/10.1109/IPDPS57955.2024.00057

Nichols, Daniel; Movsesyan, Alexander; Yeom, Jae-Seung; Sarkar, Abhik; Milroy, Daniel; Patki, Tapasya; Bhatele, Abhinav (May 2024, Proceedings of the IEEE International Parallel & Distributed Processing Symposium. IEEE Computer Society.)

Full Text Available

Resource Utilization Aware Job Scheduling to Mitigate Performance Variability

https://doi.org/10.1109/IPDPS53621.2022.00040

Nichols, Daniel; Marathe, Aniruddha; Shoga, Kathleen; Gamblin, Todd; Bhatele, Abhinav (May 2022, Proceedings of the IEEE International Parallel & Distributed Processing Symposium. IEEE Computer Society.)

Full Text Available

MagmaDNN: Towards High-Performance Data Analytics and Machine Learning for Data-Driven Scientific Computing

Nichols, Daniel; Tomov, Nathalie; Betancourt, Frank; Tomov, Stanimire; Wong, Kwai; Dongarra, Jack (December 2019, Proceedings International Conference on High Performance Computing)

In this paper, we present work towards the development of a new data analytics and machine learning (ML) framework called MagmaDNN. Our main goal is to provide scalable, high-performance data analytics and ML solutions for scientific applications running on current and upcoming heterogeneous many-core GPU-accelerated architectures. To this end, since many of the functionalities needed are based on standard linear algebra (LA) routines, we designed MagmaDNN to derive its performance from the MAGMA library. The close integration provides the fundamental (scalable, high-performance) LA routines available in MAGMA as a backend to MagmaDNN. We present some design issues for performance and scalability that are specific to ML using Deep Neural Networks (DNN), as well as the MagmaDNN designs for overcoming them. In particular, MagmaDNN uses well-established HPC techniques from the area of dense LA, including task-based parallelization, DAG representations, scheduling, mixed-precision algorithms, asynchronous solvers, and autotuned hyperparameter optimization. We illustrate these techniques and how their incorporation allows MagmaDNN to outperform other currently available frameworks.
Full Text Available

MagmaDNN: Accelerated Deep Learning Using MAGMA

https://doi.org/10.1145/3332186.3333047

Nichols, Daniel; Wong, Kwai; Tomov, Stan; Ng, Lucien; Chen, Sihan; Gessinger, Alex (January 2019, PEARC19)

MagmaDNN [17] is a deep learning framework built on the highly optimized MAGMA dense linear algebra package. The library offers performance comparable to that of other popular frameworks, such as TensorFlow, PyTorch, and Theano. The framework is implemented in C++, which provides fast memory operations, direct CUDA access, and compile-time errors. Common neural network layers such as Fully Connected, Convolutional, Pooling, Flatten, and Dropout are included. Hyperparameter tuning is performed with a parallel grid search engine. MagmaDNN uses several techniques to accelerate network training. For instance, convolutions are performed using the Winograd algorithm and FFTs. Other techniques include MagmaDNN's custom memory manager, which is used to reduce expensive memory transfers, and accelerated training by distributing batches across GPU nodes. This paper provides an overview of the MagmaDNN framework and how it leverages the MAGMA library to attain speed increases. It also addresses how deep networks are accelerated by training in parallel and the remaining challenges with parallelization.
Full Text Available

openDIEL: A Parallel Workflow Engine and Data Analytics Framework

https://doi.org/10.1145/3332186.3333051

Betancourt, Frank; Wong, Kwai; Asemota, Efosa; Marshall, Quindell; Nichols, Daniel; Tomov, Stanimire (January 2019, PEARC19)

openDIEL is a workflow engine that aims to give researchers and users of HPC an efficient way to coordinate, organize, and interconnect many disparate modules of computation in order to effectively utilize and allocate HPC resources [13]. A GUI has been developed to aid in creating workflows. It allows for the specification of data science jobs, including the specification of neural network architectures, data processing, and hyperparameter tuning. Existing machine learning tools can be readily used in openDIEL, allowing for easy experimentation with various models and approaches.
Full Text Available
