Development of a polygenic risk score to predict recurrence and survival in patients with resected cholangiocarcinoma
Ainhoa Lapitz, Beatriz Val, Tania Pastor, Laura Izquierdo-Sanchez, Colm J O’Rourke, Alejandro Forner, Cedric Coulouarn, Cedric Menard, Giovanni Brandi, Simona Tavolari, Lewis Roberts, Ahmed Fowsiyo, Mohamed Ali, Rocío I R Macias, Vincenzo Cardinale, Jesper B Andersen, Adelaida La Casta, Luis Bujanda, Maria J Perugorria, Raúl Jimenez-Aguero, Jesus M Banales, Pedro M Rodrigues
Abstract
Background and Aims
Cholangiocarcinoma (CCA) is a rare and aggressive malignancy originating from the bile ducts, characterized by a poor prognosis and limited treatment options. Potentially curative interventions include surgical resection with negative margins (R0) and liver transplantation. However, post-surgical recurrence remains common despite these treatments, with 50–70% of patients relapsing within five years. This highlights the urgent need for reliable prognostic biomarkers to guide clinical decision-making and personalize postoperative care. Numerous tumor tissue prognostic biomarkers have been proposed for CCA, but many of the supporting studies are limited by single-center designs, small patient cohorts, and semi-quantitative techniques such as immunohistochemistry. These techniques often rely on observer interpretation and antibody specificity, which can compromise objectivity and reproducibility. To overcome these limitations, we aimed to identify and validate robust quantitative prognostic biomarkers for CCA by analyzing tumor mRNA expression, to integrate them using machine learning (ML), and to develop a web-based tool to predict tumor recurrence and overall survival (OS).
Method
A systematic literature review was conducted to identify candidate prognostic tissue biomarkers for CCA, which were further validated using Cox proportional hazards (PH) OS analyses in five independent transcriptomic cohorts. An international multicenter cohort of 221 patients undergoing tumor resection was assembled, and tumor mRNA expression was analyzed using high-throughput TaqMan OpenArray RT-qPCR. Associations between clinical and molecular features and recurrence-free survival (RFS) and OS were assessed using Cox PH analyses. Six ML survival models were trained and evaluated using nested cross-validation, followed by feature reduction. Shapley Additive exPlanations (SHAP) values were applied to assess feature importance and enhance model interpretability.
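The nested cross-validation scheme mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, fold counts (5 outer, 3 inner), and seed handling are assumptions made for illustration. The point it shows is that inner folds, used for model selection, are drawn only from the outer training set, so each outer test fold stays untouched until final evaluation.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle the given indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for each outer fold.

    inner_splits is a list of (inner_train, inner_val) pairs built only from
    outer_train, so the outer test fold never influences model selection.
    Fold counts and seed are illustrative assumptions, not values from the study.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        # Everything outside the current outer fold is available for training.
        outer_train = [
            j for f, fold in enumerate(outer_folds) if f != i for j in fold
        ]
        # Re-split the outer training set for hyperparameter tuning.
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [
                j for f, fold in enumerate(inner_folds) if f != m for j in fold
            ]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits
```

In use, each candidate model would be tuned on the inner splits and then scored once on the corresponding outer test fold, for example by iterating over `nested_cv_splits(221)` for a cohort of 221 patients.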
Results
Among 496 candidate CCA prognostic biomarkers identified from the literature, 52 showed prognostic value in Cox PH analyses across transcriptomic cohorts. In the multicenter validation cohort, integrative Cox PH analyses incorporating clinical variables and gene expression identified 11 genes significantly associated with RFS and 25 with OS. The 11 RFS-associated genes, combined with five clinical variables, were used to train six ML survival models, among which the random survival forest (RSF) model demonstrated the highest predictive accuracy. Following training of RSF models across all combinations of four genes and five clinical variables, and subsequent feature importance-based selection, a final model incorporating three genes (FSCN1, LDHA, and SLC2A1) and four clinical variables (CCA subtype, resection margin status, biological sex, and age at surgery) achieved a testing time-dependent AUC of 0.82. This model significantly stratified patients into high and low recurrence-risk groups, with a median RFS (mRFS) of 8 months and 70 months, respectively. In addition, an OS prediction model using the same feature set achieved a testing AUC of 0.775 and significantly stratified patients according to 5-year OS risk, with a median OS (mOS) of 23 months in the high-risk group and 77 months in the low-risk group.
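Median survival figures such as the mRFS and mOS reported above are conventionally read off a Kaplan-Meier curve; the abstract does not state the estimator used, so the following is only a sketch under that assumption. The function name is hypothetical, and the median is taken as the earliest time at which the estimated survival probability falls to 0.5 or below.

```python
def kaplan_meier_median(times, events):
    """Return the Kaplan-Meier median survival time, or None if not reached.

    times  : follow-up time per patient (e.g. in months)
    events : 1 if the event (recurrence or death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        removed = 0   # patients leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - observed / at_risk
            if survival <= 0.5:
                return t
        at_risk -= removed
    return None
```

Applied separately to the patients a model assigns to the high-risk and low-risk groups, a function like this would give one median per group; a return value of `None` means that fewer than half of the group had an estimated event by the end of follow-up.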
Conclusion
We identified and validated a robust panel of prognostic tissue biomarkers in patients with resected CCA. Integration of these biomarkers with clinical variables into ML-based prognostic models enabled accurate prediction of RFS and OS. The resulting web-based tool allows clinicians to input RT-qPCR results for the validated biomarkers and obtain precise, individualized prognostic predictions for postoperative recurrence and survival.