skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Transfer-learning-based Autotuning using Gaussian Copula
As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computationally expensive approach. Transfer learning (TL)-based autotuning seeks to address this issue by leveraging the data from prior tuning. Current TL methods for autotuning spend significant time modeling the relationship between parameter configurations and performance, which is ineffective for few-shot (that is, few empirical evaluations) tuning on new tasks. We introduce the first generative TL-based autotuning approach based on the Gaussian copula (GC) to model the high-performing regions of the search space from prior data and then generate high-performing configurations for new tasks. This allows a sampling-based approach that maximizes few-shot performance and provides the first probabilistic estimation of the few-shot budget for effective TL-based autotuning. We compare our generative TL approach with state-of-the-art autotuning techniques on several benchmarks. We find that the GC is capable of achieving 64.37% of peak few-shot performance in its first evaluation. Furthermore, the GC model can determine a few-shot transfer budget that yields up to 33.39X speedup, a dramatic improvement over the 20.58X speedup using prior techniques.  more » « less
Award ID(s):
1942182
PAR ID:
10566581
Author(s) / Creator(s):
; ; ; ; ; ; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400700569
Page Range / eLocation ID:
37 to 49
Format(s):
Medium: X
Location:
Orlando FL USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate modeling of the gain spectrum in erbium-doped fiber amplifiers (EDFAs) is essential for optimizing optical network performance, particularly as networks evolve toward multi-vendor solutions. In this work, we propose a generalized few-shot transfer learning architecture based on a semi-supervised self-normalizing neural network (SS-NN) that leverages internal EDFA features—such as VOA input/output power and attenuation—to improve gain spectrum prediction. Our SS-NN model employs a two-phase training strategy comprising unsupervised pre-training with noise-augmented measurements and supervised fine-tuning with a custom-weighted MSE loss. Furthermore, we extend the framework with transfer learning (TL) techniques that enable both homogeneous (same-feature space) and heterogeneous (different-feature sets) model adaptation across booster, pre-amplifier, and ILA EDFAs. To address feature mismatches in heterogeneous TL, we incorporate a covariance matching loss to align second-order feature statistics between the source and target domains. Extensive experiments conducted across 26 EDFAs in the COSMOS and Open Ireland testbeds demonstrate that the proposed approach significantly reduces the number of measurement requirements on the system while achieving lower mean absolute errors and improved error distributions compared to benchmark methods. 
    more » « less
  2. Biscarat, C.; Campana, S.; Hegner, B.; Roiser, S.; Rovelli, C.I.; Stewart, G.A. (Ed.)
    The reconstruction of charged particle trajectories, known as tracking, is one of the most complex and CPU consuming parts of event processing in high energy particle physics experiments. The most widely used and best performing tracking algorithms require significant geometry-specific tuning of the algorithm parameters to achieve best results. In this paper, we demonstrate the usage of machine learning techniques, particularly evolutionary algorithms, to find high performing configurations for the first step of tracking, called track seeding. We use a track seeding algorithm from the software framework A Common Tracking Software (ACTS). ACTS aims to provide an experimentindependent and framework-independent tracking software designed for modern computing architectures. We show that our optimization algorithms find highly performing configurations in ACTS without hand-tuning. These techniques can be applied to other reconstruction tasks, improving performance and reducing the need for laborious hand-tuning of parameters. 
    more » « less
  3. Language models are achieving impressive performance on various tasks by aggressively adopting inference-time prompting techniques,such as zero-shot and few-shot prompting. In this work, we introduce EchoPrompt, a simple yet effective approach that prompts the model to rephrase its queries before answering them. EchoPrompt is tailored for four scenarios, including standard and chain-of-thought prompting, in both zero-shot and few-shot settings. Experimental results show that EchoPrompt yields substantial improvementsacross all these settings for four families of causal language models. These improvements are observed across various numerical reasoning (e.g., GSM8K, SVAMP), reading comprehension (e.g., DROP), and logical reasoning (e.g., Coin flipping) tasks. On average, EchoPrompt improves the Zero-shot-CoT performance of code-davinci-002 by 5% in numerical tasks and 13% in reading comprehension tasks. Our empirical results indicate that EchoPrompt is an effective technique that enhances in-context learning performance. 
    more » « less
  4. Many scientific domains gather sufficient labels to train machine algorithms through human-in-the-loop techniques provided by the this http URL citizen science platform. As the range of projects, task types and data rates increase, acceleration of model training is of paramount concern to focus volunteer effort where most needed. The application of Transfer Learning (TL) between Zooniverse projects holds promise as a solution. However, understanding the effectiveness of TL approaches that pretrain on large-scale generic image sets vs. images with similar characteristics possibly from similar tasks is an open challenge. We apply a generative segmentation model on two Zooniverse project-based data sets: (1) to identify fat droplets in liver cells (FatChecker; FC) and (2) the identification of kelp beds in satellite images (Floating Forests; FF) through transfer learning from the first project. We compare and contrast its performance with a TL model based on the COCO image set, and subsequently with baseline counterparts. We find that both the FC and COCO TL models perform better than the baseline cases when using >75% of the original training sample size. The COCO-based TL model generally performs better than the FC-based one, likely due to its generalized features. Our investigations provide important insights into usage of TL approaches on multi-domain data hosted across different Zooniverse projects, enabling future projects to accelerate task completion. 
    more » « less
  5. We study the problem of fine-tuning a language model (LM) for a target task by optimally using the information from n auxiliary tasks. This problem has broad applications in NLP, such as targeted instruction tuning and data selection in chain-of-thought fine-tuning. The key challenge of this problem is that not all auxiliary tasks are useful to improve the performance of the target task. Thus, choosing the right subset of auxiliary tasks is crucial. Conventional subset selection methods, such as forward & backward selection, are unsuitable for LM fine-tuning because they require repeated training on subsets of auxiliary tasks. This paper introduces a new algorithm to estimate model fine-tuning performances without repeated training. Our algorithm first performs multitask training using the data of all the tasks to obtain a meta initialization. Then, we approximate the model fine-tuning loss of a subset using functional values and gradients from the meta initialization. Empirically, we find that this gradient-based approximation holds with remarkable accuracy for twelve transformer-based LMs. Thus, we can now estimate fine-tuning performances on CPUs within a few seconds. We conduct extensive experiments to validate our approach, delivering a speedup of 30× over conventional subset selection while incurring only 1% error of the true fine-tuning performances. In downstream evaluations of instruction tuning and chain-of-thought fine-tuning, our approach improves over prior methods that utilize gradient or representation similarity for subset selection by up to 3.8%. 
    more » « less