DOI: 10.3390/biomedicines14071459 ISSN: 2227-9059

Toward Explainable Precision Nephrology: Machine Learning-Based Chronic Kidney Disease Prediction

Moiz Qureshi, Akm Azad, Hasnain Iftikhar, Paulo Canas Rodrigues

Background/Objectives: Chronic kidney disease (CKD) is an incurable, progressive condition; diagnosing it at an early stage can significantly reduce the risk of complications and improve patient outcomes. Methods: In this study, a custom dataset of 380 instances and 20 clinical attributes was used to develop and evaluate machine learning (ML) models for reliable CKD prediction, and explainable artificial intelligence (XAI) techniques were used to enhance their interpretability. Artificial neural networks, C5.0, CHAID, logistic regression, linear support vector machines (with L1 and L2 regularization), k-nearest neighbors (KNN), random tree, and deep neural networks were implemented. Correlation-based methods, recursive feature elimination, and LASSO were used for feature selection. The SMOTE and SMOTETomek resampling techniques were used to address class imbalance. Three experimental set-ups were considered: (i) using SMOTETomek, (ii) with and without SMOTE, and (iii) using features grouped by strength of correlation (high, moderate, low). Accuracy, precision, recall, F1 score, AUC, and Gini index were used to evaluate model performance. The pipeline was implemented in Python using the scikit-learn and imbalanced-learn packages. Results: SHAP and LIME improved model interpretability. The KNN classifier obtained the highest accuracy without SMOTE (94.74%), and the C5.0 model obtained the highest accuracy with SMOTE (92.98%). In the feature-group experiments, the L1-regularized linear SVM achieved 89.47% accuracy with highly correlated features. In general, both resampling methods improved model robustness, and feature selection reduced dimensionality with little loss in performance. Conclusions: The proposed ML framework shows promise for predicting CKD with high accuracy and clinically relevant interpretability. Combining feature selection with class balancing and explainable AI improves model performance and enhances clinical trustworthiness. The results indicate the potential of ML-based decision support systems for early-stage CKD diagnosis and personalized healthcare.
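
The evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption for illustration: the function names are hypothetical, the labels and scores are toy values rather than data from the study, and the Gini index is computed with the common convention Gini = 2 × AUC − 1, which the abstract does not define explicitly.

```python
# Illustrative sketch only: toy data, hypothetical helper names.
# Label 1 = CKD, label 0 = non-CKD (an assumed encoding).

def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini_from_auc(auc):
    """Gini index under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0

# Toy example: eight cases with predicted probabilities of CKD.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold (assumed)

metrics = classification_metrics(y_true, y_pred)
auc = auc_score(y_true, scores)
gini = gini_from_auc(auc)
print(metrics, auc, gini)
```

Accuracy, precision, recall and F1 depend on the chosen decision threshold, whereas AUC and the derived Gini index summarize how well the scores rank cases across all thresholds, which is one reason to report both kinds of metric.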
