Bridging Ancestry-Stratified Bias in Pharmacogenomics AI: Toward Metabolomics-Inclusive Multi-Omics Precision Medicine
Heayyean Lee, Khadijah Sajid, Dayeon Lee
Pharmacogenomics AI offers significant potential for individualized drug therapy; however, its clinical benefits remain unevenly distributed. Models trained predominantly on European-ancestry data consistently underperform in non-European populations, with polygenic risk scores (PRS) showing an estimated 39–73% reduction in predictive accuracy in African-ancestry cohorts across complex traits. These disparities have driven increased interest in moving beyond single-layer genomic approaches. Multi-omics frameworks integrating genomic, transcriptomic, proteomic, and metabolomic data have emerged as a promising strategy to improve prediction across heterogeneous clinical populations, as each molecular layer provides distinct and complementary biological information. Among these layers, metabolomics may represent a particularly transferable component across populations. Metabolite profiles capture the downstream functional output of biological systems influenced by genetic, environmental, dietary, and microbiome-related factors, and may therefore be less reliant on ancestry-stratified allele frequency structures that underlie performance disparities in genomic models. This review synthesizes evidence regarding the mechanistic basis of genomic bias in pharmacogenomics AI, the emerging role of multi-omics integration, especially metabolomics, in improving predictive performance, and the current landscape of computational strategies for bias mitigation, including federated learning, transfer learning, domain adaptation, and synthetic data generation. Collectively, current evidence supports metabolomics-inclusive multi-omics frameworks as a biologically plausible, hypothesis-generating strategy to reduce reliance on ancestry-linked genomic features. However, direct evidence that such frameworks reduce ancestry-related bias in clinical AI outputs remains limited, underscoring the need for globally diverse datasets and prospective multi-population validation.