Abstract P15: Context-aware foundation model of bulk transcriptomics for interpretable analysis of transcriptional dynamics and treatment response in AML
Yi Chai, Yang Li, Jianbiao Zhou, Wee Joo Chng, Yang Zhang
Abstract
While language models extract linguistic structures from text, similar approaches can uncover biological rules from genetic patterns. Though these methods have shown promise in single-cell analysis, bulk transcriptomics remains underexplored despite offering distinct clinical advantages, including preserved tissue-level information, higher sequencing depth, and cost-effectiveness. Here, we present a transformer-based foundation model leveraging transcriptomic profiles from over 30,000 diverse bulk RNA samples, including normal tissues and various cancer types. Unlike conventional language models, our model incorporates specialized modules for modelling pairwise gene interactions through a dual representation system that captures both gene-level features and their higher-order relationships. Our model shows robust performance across multiple downstream applications. It achieves zero-shot accuracy of 78.81% in cancer classification without fine-tuning and outperforms existing approaches in cancer stage prediction through simple fine-tuning. Notably, it can extract critical gene interaction networks without relying on prior biological knowledge. More importantly, we leverage it to introduce dynamic interpretations to static bulk transcriptomic data, successfully modelling logical gene regulation rules with 91.07% overall accuracy, reaching 100% for rules related to key genes like GATA2 and SCL. Through its context-specific modelling ability, it also identifies, for example, transcriptional dynamics in normal haematopoiesis and dysregulated circuits during the transition to leukemic states. We further demonstrate clinical utility in predicting patient response to first induction chemotherapy (AUROC=0.75) in acute myeloid leukemia, a challenging task due to patient and mechanism heterogeneity.
Through our novel response-directed feature-space gradient ascent approach, we identify patient-specific gene expression modifications that could computationally redirect resistant phenotypes toward responsive ones, revealing potential therapeutic targets aligned with individual patients' clinical features. These results establish our model as a powerful framework for extracting useful information from bulk transcriptomics data, with potential applications in precision medicine through connecting computational predictions with biological insights.
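The general idea behind a response-directed gradient ascent in feature space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a transformer-based model, whereas this example substitutes a simple logistic response predictor so that the gradient can be written by hand. The names `predict_response` and `gradient_ascent`, the weights, and the three-feature toy profile are all hypothetical.

```python
import math

def predict_response(x, w, b):
    """Probability of response under a logistic stand-in model (assumption)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def gradient_ascent(x, w, b, step=0.1, n_steps=50):
    """Shift features x in the direction that raises log p(response | x)."""
    x = list(x)
    for _ in range(n_steps):
        p = predict_response(x, w, b)
        # For a logistic model, d/dx log p = (1 - p) * w.
        x = [xi + step * (1.0 - p) * wi for xi, wi in zip(x, w)]
    return x

# Hypothetical toy values: three features of one "resistant" profile.
w = [1.5, -2.0, 0.0]
b = -0.5
x_resistant = [0.0, 1.0, 0.3]

x_shifted = gradient_ascent(x_resistant, w, b)

# Per-feature change: the candidate "modifications" in this toy setting.
delta = [after - before for before, after in zip(x_resistant, x_shifted)]
p_before = predict_response(x_resistant, w, b)
p_after = predict_response(x_shifted, w, b)
```

In this sketch, `delta` plays the role of the patient-specific modifications: features with positive weight are pushed up, those with negative weight are pushed down, and features the predictor ignores are left unchanged. With a deep model, the hand-written gradient would presumably be replaced by automatic differentiation.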
Citation Format:
Yi Chai, Yang Li, Jianbiao Zhou, Wee Joo Chng, Yang Zhang. Context-aware foundation model of bulk transcriptomics for interpretable analysis of transcriptional dynamics and treatment response in AML [abstract]. In: Proceedings of Frontiers in Cancer Science 2025; 2025 Nov 5-7; Singapore. Philadelphia (PA): AACR; Cancer Res 2026;86(13_Suppl):Abstract nr P15.