Explainable residual ensemble modelling for EuroQol-5 dimensions-based quality-of-life assessment and stratification in patients with knee osteoarthritis
Jaehyuk Lee, Sejun Oh, Jun Hwan Choi, Jin Taek Lee, Bo Ryun Kim, Sangheon Lee
Objective
To develop an interpretable machine-learning framework for supporting quality-of-life (QoL) assessment and stratification in patients with knee osteoarthritis (OA) by integrating linear and nonlinear modelling strategies.
Methods
This retrospective study used de-identified clinical data from 1,102 patients with knee OA, collected at a university hospital in South Korea between September 2013 and January 2022 and made available via the AI Hub platform. QoL was assessed using the EuroQol-5 Dimensions (EQ-5D) index and dichotomised at 0.7. A residual ensemble model combining logistic regression (LR) with a random forest residual learner was developed and evaluated using a stratified train–test split and three-fold cross-validation. Model performance was assessed using accuracy, F1-score, ROC-AUC, and PR-AUC. Model interpretability was examined using LR coefficients and SHAP analysis.
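The residual ensemble idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the three predictors are hypothetical, the 80/20 split ratio and all hyperparameters are arbitrary, and treating an EQ-5D index below 0.7 as the positive class is an assumption (the abstract gives only the cut-off). The residual learner here is a random forest regressor fitted to the difference between the observed label and the LR probability, which is one common way to build such a model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study data): three hypothetical predictors.
n = 600
X = rng.normal(size=(n, 3))

# Hypothetical EQ-5D index with a linear part and a nonlinear interaction part.
eq5d = (0.75 - 0.10 * X[:, 0] - 0.05 * X[:, 1]
        - 0.08 * X[:, 0] * X[:, 2]
        + rng.normal(scale=0.05, size=n))

# Dichotomise at 0.7. Labelling "below 0.7" as class 1 is an assumption.
y = (eq5d < 0.7).astype(int)

# Stratified train-test split (the 80/20 ratio is an assumption).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stage 1: logistic regression captures the linear, interpretable signal.
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
p_lr_tr = lr.predict_proba(X_tr)[:, 1]

# Stage 2: a random forest regressor learns what LR missed,
# i.e. the residual between the observed label and the LR probability.
rf = RandomForestRegressor(n_estimators=200, min_samples_leaf=5, random_state=0)
rf.fit(X_tr, y_tr - p_lr_tr)


def predict_proba_residual(X_new):
    """LR probability plus the forest's residual correction, clipped to [0, 1]."""
    p = lr.predict_proba(X_new)[:, 1] + rf.predict(X_new)
    return np.clip(p, 0.0, 1.0)


p_te = predict_proba_residual(X_te)
y_pred = (p_te >= 0.5).astype(int)
```

In this arrangement the LR coefficients stay directly readable, while the forest contributes only a correction term that can be inspected separately, for example with SHAP.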
Results
The proposed model achieved superior performance on the independent test set (accuracy = 0.88, precision = 0.85, recall = 0.83, F1 = 0.84, ROC-AUC = 0.93, PR-AUC = 0.91), outperforming individual baseline models. Key predictors included functional limitation (WOMAC), pain severity (VAS), and surgical history (TKA). Incorporating interaction features further improved accuracy to 0.89 without compromising interpretability.
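As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values given above. The helper name below is hypothetical and used only for illustration.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the independent test set.
f1 = f1_from_precision_recall(0.85, 0.83)
print(round(f1, 2))  # 0.84
```

The recomputed value agrees with the reported F1 of 0.84 to two decimal places.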
Conclusions
The proposed residual ensemble framework effectively balances predictive performance and interpretability, providing a clinically meaningful basis for QoL assessment and risk stratification in knee OA. This approach supports the development of explainable decision-support tools in digital musculoskeletal health.