skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A linear adjustment-based approach to posterior drift in transfer learning
Summary We present new models and methods for the posterior drift problem where the regression function in the target domain is modelled as a linear adjustment, on an appropriate scale, of that in the source domain, and study the theoretical properties of our proposed estimators in the binary classification problem. The core idea of our model inherits the simplicity and the usefulness of generalized linear models and accelerated failure time models from the classical statistics literature. Our approach is shown to be flexible and applicable in a variety of statistical settings, and can be adopted for transfer learning problems in various domains including epidemiology, genetics and biomedicine. As concrete applications, we illustrate the power of our approach (i) through mortality prediction for British Asians by borrowing strength from similar data from the larger pool of British Caucasians, using the UK Biobank data, and (ii) in overcoming a spurious correlation present in the source domain of the Waterbirds dataset.  more » « less
Award ID(s):
2027737 2052653 2113373 1830247
PAR ID:
10490531
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Biometrika
Volume:
111
Issue:
1
ISSN:
0006-3444
Format(s):
Medium: X Size: p. 31-50
Size(s):
p. 31-50
Sponsoring Org:
National Science Foundation
More Like this
  1. SUMMARY The British Isles' lithospheric structure, shaped by a dynamic geological history, remains incompletely understood, particularly regarding anelastic parameters, such as attenuation. In this study, we present a teleseismic attenuation model for the British Isles, using time-domain analysis of teleseismic P-wave data from 28 deep earthquakes. We constructed a 2-D differential attenuation map (Δt*) that reveals significant regional variations. Our findings show a weak anticorrelation between Δt* and shear wave velocity at upper mantle depths, suggesting that variations in the lithosphere–asthenosphere system influence this pattern. A high-attenuation zone extends from Scotland across the Irish Sea to southwest England, potentially linked to mantle upwelling associated with the Iceland plume. This model provides new insights into the mantle dynamics beneath the British Isles, offering a crucial reference for future geophysical studies in the region. 
    more » « less
  2. Evaluating the performance of machine learning models under distribution shifts is challenging, especially when we only have unlabeled data from the shifted (target) domain, along with labeled data from the original (source) domain. Recent work suggests that the notion of disagreement, the degree to which two models trained with different randomness differ on the same input, is a key to tackling this problem. Experimentally, disagreement and prediction error have been shown to be strongly connected, which has been used to estimate model performance. Experiments have led to the discovery of the disagreement-on-the-line phenomenon, whereby the classification error under the target domain is often a linear function of the classification error under the source domain; and whenever this property holds, disagreement under the source and target domain follow the same linear relation. In this work, we develop a theoretical foundation for analyzing disagreement in high-dimensional random features regression; and study under what conditions the disagreement-on-the-line phenomenon occurs in our setting. Experiments on CIFAR-10-C, Tiny ImageNet-C, and Camelyon17 are consistent with our theory and support the universality of the theoretical findings. 
    more » « less
  3. Abstract We study the problem of fitting a piecewise affine (PWA) function to input–output data. Our algorithm divides the input domain into finitely many regions whose shapes are specified by a user-provided template and such that the input–output data in each region are fit by an affine function within a user-provided error tolerance. We first prove that this problem is NP-hard. Then, we present a top-down algorithmic approach for solving the problem. The algorithm considers subsets of the data points in a systematic manner, trying to fit an affine function for each subset using linear regression. If regression fails on a subset, the algorithm extracts a minimal set of points from the subset (an unsatisfiable core) that is responsible for the failure. The identified core is then used to split the current subset into smaller ones. By combining this top-down scheme with a set-covering algorithm, we derive an overall approach that provides optimal PWA models for a given error tolerance, where optimality refers to minimizing the number of pieces of the PWA model. We demonstrate our approach on three numerical examples that include PWA approximations of a widely used nonlinear insulin–glucose regulation model and a double inverted pendulum with soft contacts. 
    more » « less
  4. The problem of adapting models from a source domain using data from any target domain of interest has gained prominence, thanks to the brittle generalization in deep neural networks. While several test-time adaptation techniques have emerged, they typically rely on synthetic data augmentations in cases of limited target data availability. In this paper, we consider the challenging setting of single-shot adaptation and explore the design of augmentation strategies. We argue that augmentations utilized by existing methods are insufficient to handle large distribution shifts, and hence propose a new approach SiSTA (Single-Shot Target Augmentations), which first fine-tunes a generative model from the source domain using a single-shot target, and then employs novel sampling strategies for curating synthetic target data. Using experiments with a state-of-the-art domain adaptation method, we find that SiSTA produces improvements as high as 20% over existing baselines under challenging shifts in face attribute detection, and that it performs competitively to oracle models obtained by training on a larger target dataset. Our codes can be accessed at github.com/kowshikthopalli/SISTA. 
    more » « less
  5. null (Ed.)
    Human activity recognition (HAR) from wearable sensors data has become ubiquitous due to the widespread proliferation of IoT and wearable devices. However, recognizing human activity in heterogeneous environments, for example, with sensors of different models and make, across different persons and their on-body sensor placements introduces wide range discrepancies in the data distributions, and therefore, leads to an increased error margin. Transductive transfer learning techniques such as domain adaptation have been quite successful in mitigating the domain discrepancies between the source and target domain distributions without the costly target domain data annotations. However, little exploration has been done when multiple distinct source domains are present, and the optimum mapping to the target domain from each source is not apparent. In this paper, we propose a deep Multi-Source Adversarial Domain Adaptation (MSADA) framework that opportunistically helps select the most relevant feature representations from multiple source domains and establish such mappings to the target domain by learning the perplexity scores. We showcase that the learned mappings can actually reflect our prior knowledge on the semantic relationships between the domains, indicating that MSADA can be employed as a powerful tool for exploratory activity data analysis. We empirically demonstrate that our proposed multi-source domain adaptation approach achieves 2% improvement with OPPORTUNITY dataset (cross-person heterogeneity, 4 ADLs), whereas 13% improvement on DSADS dataset (cross-position heterogeneity, 10 ADLs and sports activities). 
    more » « less