Explainable machine learning for patient‐specific quality assurance in intensity‐modulated radiotherapy based on anatomical structures
Xuerou Zhang, Ying Huang, Jie Wang, Xingtong Zhang, Hua Chen, Yehui Luo, Jianhao Xie, Zhiyong Xu, Yunhua Xu
Abstract
Background
Patient‐specific quality assurance (PSQA) plays a pivotal role in intensity‐modulated radiotherapy (IMRT) to ensure accurate dose delivery. However, conventional measurement‐based PSQA approaches are labor‐intensive and provide limited insight into the underlying factors contributing to variations in gamma passing rates (GPRs). Anatomical characteristics of the planning target volume (PTV) and organs at risk (OARs) may contain predictive information relevant to GPR performance, yet their potential has not been fully explored within interpretable machine learning frameworks.
Purpose
This study aimed to develop an interpretable machine learning (ML) framework for predicting GPRs in IMRT based on anatomical features extracted from the PTV and OARs.
Methods
A retrospective cohort of 243 clinical chest IMRT plans was analyzed. Radiomic and dosimetric features were extracted for each anatomical structure. Two ML regression models, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), were developed to predict GPRs for the PTV and OARs under four gamma criteria (3%/3 mm, 3%/2 mm, 2%/3 mm, and 2%/2 mm). The reference GPR was obtained by comparing the original planned dose distribution with the dose distribution reconstructed from linear accelerator delivery log files using the independent Monte Carlo (MC) dose calculation software ArcherQA (Wisdom Technology Company Limited, Hefei, China); it was calculated using global gamma analysis with a 10% dose threshold. Model performance was evaluated using the mean absolute error (MAE), root mean square error (RMSE), and Spearman's rank correlation coefficient. Shapley Additive Explanations (SHAP) were applied to interpret feature contributions in the best‐performing model.
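For readers unfamiliar with the reference quantity, the following is a minimal sketch of a global gamma passing rate with a low-dose threshold, reduced to a one-dimensional dose profile for clarity. It is not the ArcherQA implementation or the authors' code: the function name, its parameters, the brute-force search, and the choice of normalizing to the maximum reference dose are all illustrative assumptions, and clinical gamma analysis operates on 3D dose grids with interpolation.

```python
import math


def gamma_passing_rate(ref_dose, eval_dose, spacing_mm,
                       dose_tol=0.03, dta_mm=3.0, threshold=0.10):
    """Return a global 1D gamma passing rate in percent (illustrative only).

    ref_dose, eval_dose: dose samples on the same uniformly spaced grid.
    spacing_mm: distance between neighbouring samples in millimetres.
    dose_tol: dose-difference criterion as a fraction (0.03 means 3%).
    dta_mm: distance-to-agreement criterion in millimetres.
    threshold: points below this fraction of the maximum reference dose
        are excluded from the analysis (0.10 means a 10% threshold).
    """
    if len(ref_dose) != len(eval_dose):
        raise ValueError("dose profiles must have the same length")

    max_ref = max(ref_dose)
    # Global normalization (assumed here): the dose criterion is a fixed
    # fraction of the maximum reference dose rather than of the local dose.
    dose_norm = dose_tol * max_ref
    cutoff = threshold * max_ref

    passed = 0
    total = 0
    for i, d_ref in enumerate(ref_dose):
        if d_ref < cutoff:
            continue  # skip low-dose points below the threshold
        total += 1
        # Squared gamma is the minimum, over all evaluated points, of the
        # combined normalized distance and normalized dose difference.
        best_sq = math.inf
        for j, d_eval in enumerate(eval_dose):
            dist = (j - i) * spacing_mm
            gamma_sq = (dist / dta_mm) ** 2 + ((d_eval - d_ref) / dose_norm) ** 2
            best_sq = min(best_sq, gamma_sq)
        if best_sq <= 1.0:
            passed += 1

    if total == 0:
        raise ValueError("no reference points above the dose threshold")
    return 100.0 * passed / total
```

In this sketch, tightening either criterion (for example from 3%/3 mm to 2%/2 mm) can only lower or preserve the passing rate for a given pair of profiles, which is why GPRs are reported separately for each gamma criterion.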
Results
Both models demonstrated robust predictive performance across different anatomical structures and gamma criteria. As the gamma criteria became less stringent, prediction errors decreased. Prediction accuracy was relatively high for OARs; for example, under the 3%/3 mm criterion, the test‐set MAE was 0.06% ± 0.01% for the heart and 0.26% ± 0.04% for the whole lung. In contrast, the prediction error was larger for the PTV, with a test‐set MAE of 1.98% ± 0.31% under the same criterion. SHAP analysis revealed that texture‐related radiomic features contributed most substantially to model predictions. Moreover, feature importance patterns varied according to organ type and gamma‐criterion stringency.
Conclusions
Multi‐omics descriptors derived from anatomical structures can reliably predict GPRs in IMRT. The proposed interpretable ML framework not only achieves accurate prediction but also enhances mechanistic understanding through SHAP‐based explanations. These findings provide valuable insights into dose verification variability and offer a practical, transparent tool for IMRT patient‐specific quality assurance.