DOI: 10.1093/jrsssb/qkae111 ISSN: 1369-7412

Robust angle-based transfer learning in high dimensions

Tian Gu, Yi Han, Rui Duan

Abstract

Transfer learning improves target model performance by leveraging data from related source populations, especially when target data are scarce. This study addresses the challenge of training high-dimensional regression models with limited target data in the presence of heterogeneous source populations. We focus on a practical setting where only parameter estimates of pretrained source models are available, rather than individual-level source data. For a single source model, we propose a novel angle-based transfer learning (angleTL) method that leverages concordance between source and target model parameters. AngleTL adapts to the signal strength of the target model, unifies several benchmark methods, and mitigates negative transfer when between-population heterogeneity is large. We extend angleTL to incorporate multiple source models, accounting for varying levels of relevance among them. Our high-dimensional asymptotic analysis provides insights into when a source model benefits the target model and demonstrates the superiority of angleTL over other methods. Extensive simulations validate these findings and highlight the feasibility of applying angleTL to transfer genetic risk prediction models across multiple biobanks.