NSF PAR Search | NSF Public Access Repository

ConTraPh: Contrastive Learning for Parallelization and Performance Optimization

Mahmud, Quazi Ishtiaque; TehraniJamsaz, Ali; Ahmed, Nesreen K; Willke, Theodore L; Jannesari, Ali (August 2025, ACM International Conference on Supercomputing)

Free, publicly-accessible full text available August 10, 2026
AutoParLLM: GNN-guided Context Generation for Zero-Shot Code Parallelization using LLMs

https://doi.org/10.18653/v1/2025.naacl-long.593

Mahmud, Quazi Ishtiaque; TehraniJamsaz, Ali; Phan, Hung D; Chen, Le; Capotă, Mihai; Willke, Theodore L; Ahmed, Nesreen K; Jannesari, Ali (January 2025, Association for Computational Linguistics)

Full Text Available
A Structure-Aware Framework for Learning Device Placements on Computation Graphs

Duan, Shukai; Ping, Heng; Kanakaris, Nikos; Xiao, Xiongye; Kyriakis, Panagiotis; Ahmed, Nesreen K; Zhang, Peiyu; Ma, Guixiang; Capotă, Mihai; Nazarian, Shahin; et al (December 2024, NeurIPS)

Computation graphs are Directed Acyclic Graphs (DAGs) whose nodes correspond to mathematical operations; they are widely used as abstractions in the optimization of neural networks. The device placement problem aims to identify optimal allocations of those nodes to a set of (potentially heterogeneous) devices. Existing approaches rely on two types of architectures, known as grouper-placer and encoder-placer. In this work, we bridge the gap between encoder-placer and grouper-placer techniques and propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit. The framework consists of five steps, including graph coarsening, node representation learning, and policy optimization. It facilitates end-to-end training and takes into account the DAG nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, that enables graph representation learning and joint, personalized graph partitioning using an unspecified number of groups. To train the entire framework, we use reinforcement learning with the execution time of the placement as the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models over both CPU execution and other commonly used baselines.
Full Text Available
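The device-placement abstract above uses the execution time of a placement as the reinforcement-learning reward. As a rough illustration only, the sketch below scores a placement on a toy DAG with a simulated cost model. The function names (`placement_makespan`, `reward`), the fixed `transfer_cost`, and the per-device compute times are all assumptions made for this example; the paper measures real execution time and does not describe this simulation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def placement_makespan(deps, cost, placement, transfer_cost=1.0):
    """Simulated finish time of the last node for a node -> device placement.

    deps:      node -> set of predecessor nodes (the DAG)
    cost:      (node, device) -> compute time (hypothetical values)
    placement: node -> device
    Each device runs one node at a time; a cross-device edge adds transfer_cost.
    """
    device_free = {}  # device -> time at which it becomes idle
    finish = {}       # node -> finish time
    for node in TopologicalSorter(deps).static_order():
        dev = placement[node]
        ready = 0.0
        for pred in deps.get(node, ()):
            penalty = transfer_cost if placement[pred] != dev else 0.0
            ready = max(ready, finish[pred] + penalty)
        start = max(ready, device_free.get(dev, 0.0))
        finish[node] = start + cost[(node, dev)]
        device_free[dev] = finish[node]
    return max(finish.values())


def reward(deps, cost, placement):
    """Shorter simulated execution time means higher reward."""
    return -placement_makespan(deps, cost, placement)


# Toy diamond-shaped graph: a -> {b, c} -> d
deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
cost = {(n, "cpu"): 2.0 for n in deps} | {(n, "gpu"): 1.0 for n in deps}

all_cpu = {n: "cpu" for n in deps}
mixed = {"a": "cpu", "b": "cpu", "c": "gpu", "d": "cpu"}

print(placement_makespan(deps, cost, all_cpu))
print(placement_makespan(deps, cost, mixed))
```

In this toy setting, moving one branch to a second device shortens the simulated run despite the transfer penalty, so a policy trained on `reward` would prefer the mixed placement.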
A probabilistic approach to discovering dynamic full-brain functional connectivity patterns

https://doi.org/10.1016/j.neuroimage.2018.01.071

Manning, Jeremy R.; Zhu, Xia; Willke, Theodore L.; Ranganath, Rajesh; Stachenfeld, Kimberly; Hasson, Uri; Blei, David M.; Norman, Kenneth A. (October 2018, NeuroImage)

Full Text Available
BrainIAK: The Brain Imaging Analysis Kit

https://doi.org/10.52294/31bb5b68-2184-411b-8c00-a1dacb61e1da

Kumar, Manoj; Anderson, Michael J.; Antony, James W.; Baldassano, Christopher; Brooks, Paula P.; Cai, Ming Bo; Chen, Po-Hsuan Cameron; Ellis, Cameron T.; Henselman-Petrusek, Gregory; Huberdeau, David; et al (January 2021, Aperture Neuro)

Functional magnetic resonance imaging (fMRI) offers a rich source of data for studying the neural basis of cognition. Here, we describe the Brain Imaging Analysis Kit (BrainIAK), an open-source, free Python package that provides computationally optimized solutions to key problems in advanced fMRI analysis. A variety of techniques are presently included in BrainIAK: intersubject correlation (ISC) and intersubject functional connectivity (ISFC), functional alignment via the shared response model (SRM), full correlation matrix analysis (FCMA), a Bayesian version of representational similarity analysis (BRSA), event segmentation using hidden Markov models, topographic factor analysis (TFA), inverted encoding models (IEMs), an fMRI data simulator that uses noise characteristics from real data (fmrisim), and some emerging methods. These techniques have been optimized to leverage the efficiencies of high-performance compute (HPC) clusters, and the same code can be seamlessly transferred from a laptop to a cluster. For each of the aforementioned techniques, we describe the data analysis problem that the technique is meant to solve and how it solves that problem; we also include an example Jupyter notebook for each technique and an annotated bibliography of papers that have used and/or described that technique. In addition to the sections describing various analysis techniques in BrainIAK, we have included sections describing the future applications of BrainIAK to real-time fMRI, tutorials that we have developed and shared online to facilitate learning the techniques in BrainIAK, computational innovations in BrainIAK, and how to contribute to BrainIAK. We hope that this manuscript helps readers to understand how BrainIAK might be useful in their research.
Full Text Available
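Among the techniques listed in the BrainIAK abstract above, intersubject correlation (ISC) is simple enough to sketch directly. The example below shows one common leave-one-out formulation for a single voxel in plain Python: each subject's time series is correlated with the average of the remaining subjects. It does not use BrainIAK's own API, and the helper names `pearson` and `leave_one_out_isc` are hypothetical, chosen for this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def leave_one_out_isc(series):
    """series: one time series per subject (same length, same voxel).

    Returns one ISC value per subject: the correlation between that
    subject's series and the mean series of all other subjects.
    """
    n_subjects = len(series)
    iscs = []
    for i, left_out in enumerate(series):
        others = [s for j, s in enumerate(series) if j != i]
        mean_others = [sum(vals) / (n_subjects - 1) for vals in zip(*others)]
        iscs.append(pearson(left_out, mean_others))
    return iscs


# Three synthetic "subjects" whose responses are linear transforms
# of the same underlying signal, so every ISC should be close to 1.
shared = [0.0, 1.0, 0.5, -0.5, -1.0, 0.2]
subjects = [
    shared,
    [2.0 * v + 1.0 for v in shared],
    [0.5 * v - 3.0 for v in shared],
]
iscs = leave_one_out_isc(subjects)
print(iscs)
```

Real fMRI data would apply this per voxel or region across many time points; the sketch raises a division-by-zero error if any series is constant.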