Transfer learning refers to the transfer of knowledge or information from a relevant source domain to a target domain. However, most existing transfer learning theories and algorithms focus on IID tasks, where the source/target samples are assumed to be independent and identically distributed. Very little effort is devoted to theoretically studying the knowledge transferability on non-IID tasks, e.g., cross-network mining. To bridge the gap, in this paper, we propose rigorous generalization bounds and algorithms for cross-network transfer learning from a source graph to a target graph. The crucial idea is to characterize the cross-network knowledge transferability from the perspective of the Weisfeiler-Lehman graph isomorphism test. To this end, we propose a novel Graph Subtree Discrepancy to measure the graph distribution shift between source and target graphs. Then the generalization error bounds on cross-network transfer learning, including both cross-network node classification and link prediction tasks, can be derived in terms of the source knowledge and the Graph Subtree Discrepancy across domains. This thereby motivates us to propose a generic graph adaptive network (GRADE) to minimize the distribution shift between source and target graphs for cross-network transfer learning. Experimental results verify the effectiveness and efficiency of our GRADE framework on both cross-network node classification and cross-domain recommendation tasks.
more »
« less
Evaluating the Values of Sources in Transfer Learning
Transfer learning that adapts a model trained on data-rich sources to low-resource targets has been widely applied in natural language processing (NLP). However, when training a transfer model over multiple sources, not every source is equally useful for the target. To better transfer a model, it is essential to understand the values of the sources. In this paper, we develop , an efficient source valuation framework for quantifying the usefulness of the sources (e.g., ) in transfer learning based on the Shapley value method. Experiments and comprehensive analyses on both cross-domain and cross-lingual transfers demonstrate that our framework is not only effective in choosing useful transfer sources but also the source values match the intuitive source-target similarity.
more »
« less
- Award ID(s):
- 1927554
- PAR ID:
- 10294389
- Date Published:
- Journal Name:
- Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Page Range / eLocation ID:
- 5084 to 5116
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Recent years have witnessed a growing body of research on autonomous activity recognition models for use in deployment of mobile systems in new settings such as when a wearable system is adopted by a new user. Current research, however, lacks comprehensive frameworks for transfer learning. Specifically, it lacks the ability to deal with partially available data in new settings. To address these limitations, we propose {\it OptiMapper}, a novel uninformed cross-subject transfer learning framework for activity recognition. OptiMapper is a combinatorial optimization framework that extracts abstract knowledge across subjects and utilizes this knowledge for developing a personalized and accurate activity recognition model in new subjects. To this end, a novel community-detection-based clustering of unlabeled data is proposed that uses the target user data to construct a network of unannotated sensor observations. The clusters of these target observations are then mapped onto the source clusters using a complete bipartite graph model. In the next step, the mapped labels are conditionally fused with the prediction of a base learner to create a personalized and labeled training dataset for the target user. We present two instantiations of OptiMapper. The first instantiation, which is applicable for transfer learning across domains with identical activity labels, performs a one-to-one bipartite mapping between clusters of the source and target users. The second instantiation performs optimal many-to-one mapping between the source clusters and those of the target. The many-to-one mapping allows us to find an optimal mapping even when the target dataset does not contain sufficient instances of all activity classes. We show that this type of cross-domain mapping can be formulated as a transportation problem and solved optimally. We evaluate our transfer learning techniques on several activity recognition datasets. Our results show that the proposed community detection approach can achieve, on average, 69%$ utilization of the datasets for clustering with an overall clustering accuracy of 87.5%. Our results also suggest that the proposed transfer learning algorithms can achieve up to 22.5% improvement in the activity recognition accuracy, compared to the state-of-the-art techniques. The experimental results also demonstrate high and sustained performance even in presence of partial data.more » « less
-
Abstract Most environmental data come from a minority of well‐monitored sites. An ongoing challenge in the environmental sciences is transferring knowledge from monitored sites to unmonitored sites. Here, we demonstrate a novel transfer‐learning framework that accurately predicts depth‐specific temperature in unmonitored lakes (targets) by borrowing models from well‐monitored lakes (sources). This method, meta‐transfer learning (MTL), builds a meta‐learning model to predict transfer performance from candidate source models to targets using lake attributes and candidates' past performance. We constructed source models at 145 well‐monitored lakes using calibrated process‐based (PB) modeling and a recently developed approach called process‐guided deep learning (PGDL). We applied MTL to either PB or PGDL source models (PB‐MTL or PGDL‐MTL, respectively) to predict temperatures in 305 target lakes treated as unmonitored in the Upper Midwestern United States. We show significantly improved performance relative to the uncalibrated PB General Lake Model, where the median root mean squared error (RMSE) for the target lakes is 2.52°C. PB‐MTL yielded a median RMSE of 2.43°C; PGDL‐MTL yielded 2.16°C; and a PGDL‐MTL ensemble of nine sources per target yielded 1.88°C. For sparsely monitored target lakes, PGDL‐MTL often outperformed PGDL models trained on the target lakes themselves. Differences in maximum depth between the source and target were consistently the most important predictors. Our approach readily scales to thousands of lakes in the Midwestern United States, demonstrating that MTL with meaningful predictor variables and high‐quality source models is a promising approach for many kinds of unmonitored systems and environmental variables.more » « less
-
Abstract Transfer learning refers to the process of adapting a model trained on a source task to a target task. While kernel methods are conceptually and computationally simple models that are competitive on a variety of tasks, it has been unclear how to develop scalable kernel-based transfer learning methods across general source and target tasks with possibly differing label dimensions. In this work, we propose a transfer learning framework for kernel methods by projecting and translating the source model to the target task. We demonstrate the effectiveness of our framework in applications to image classification and virtual drug screening. For both applications, we identify simple scaling laws that characterize the performance of transfer-learned kernels as a function of the number of target examples. We explain this phenomenon in a simplified linear setting, where we are able to derive the exact scaling laws.more » « less
-
We study the problem of transfer learning, observing that previous efforts to understand its information-theoretic limits do not fully exploit the geometric structure of the source and target domains. In contrast, our study first illustrates the benefits of incorporating a natural geometric structure within a linear regression model, which corresponds to the generalized eigenvalue problem formed by the Gram matrices of both domains. We next establish a finite-sample minimax lower bound, propose a refined model interpolation estimator that enjoys a matching upper bound, and then extend our framework to multiple source domains and generalized linear models. Surprisingly, as long as information is available on the distance between the source and target parameters, negative-transfer does not occur. Simulation studies show that our proposed interpolation estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings.more » « less