Improved prediction of childhood anemia using hybrid ensemble learning and dual-level explainability
Ashima Kukkar, Julius Moinget Loibor, Mehmet Akif Cifci, Manash Pratim Barman, Sadiq Hussain, Silvia Gaftandzhieva, Rositsa Doneva
Purpose
Millions of children under five, particularly in low- and middle-income countries, suffer from preventable anemia, making early detection critical for improving public health outcomes. This study proposes an interpretable machine learning framework for early prediction of childhood anemia using structured healthcare data.
Methods
The proposed approach integrates TabNet, XGBoost, and a Multi-Layer Perceptron (MLP) within a stacked ensemble architecture, with logistic regression as the meta-learner. Hyperparameters are optimized using GridSearchCV, which is compared with RandomizedSearchCV, Optuna, Genetic Algorithms (GA), Particle Swarm Optimization (PSO), and Ant Colony Optimization (ACO). SHAP provides global feature importance, and LIME explains individual predictions. The system is trained and tested on the Tanzania Demographic and Health Survey (DHS) dataset.
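The stacked design described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn is available, uses `GradientBoostingClassifier` as a stand-in for XGBoost, omits TabNet entirely, and runs on synthetic data rather than the DHS dataset. The grid values and variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data (an assumption; NOT the survey data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners: a gradient-boosted tree model (stand-in for XGBoost)
# and an MLP on standardized inputs. TabNet is omitted in this sketch.
base_learners = [
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    (
        "mlp",
        make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
        ),
    ),
]

# Logistic regression as meta-learner, trained on out-of-fold
# predictions of the base learners (cv=3 inside the stack).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Small, hypothetical grid searched exhaustively with GridSearchCV.
param_grid = {
    "gb__max_depth": [2, 3],
    "final_estimator__C": [0.1, 1.0],
}
search = GridSearchCV(stack, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("held-out accuracy:", round(test_accuracy, 3))
```

Training the meta-learner on out-of-fold predictions keeps it from seeing base-learner outputs on data those learners were fitted on, which is the usual motivation for stacking with internal cross-validation.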
Results
Experimental results on the Tanzania Demographic and Health Survey (TDHS 2022) dataset demonstrate that the proposed ensemble significantly outperforms individual models, achieving an accuracy of 98.5% and high precision, recall, and F1-scores across both classes. Among the optimization strategies, GridSearchCV and Optuna yield the most consistent and best-performing configurations. SHAP-based global analysis identifies key predictors such as child age, wealth index, and breastfeeding status, while LIME offers instance-level explanations, enhancing model transparency and clinical interpretability. Comparative evaluation against baseline models and ablation analysis confirm the effectiveness of the stacked architecture and meta-learning strategy. External validation using NFHS (India) data shows close agreement (±1–2%) with observed anemia prevalence trends, indicating cross-regional generalizability. The study also discusses ethical considerations, fairness implications, and deployment strategies for integration into healthcare systems.
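As a reminder of how the per-class metrics reported above are defined, the following sketch computes precision, recall, and F1 for each class of a binary problem. The labels and predictions are invented for illustration (with 1 assumed to mean anemic and 0 non-anemic) and are unrelated to the study's results; the function name is hypothetical.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f1), treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (assumption): 1 = anemic, 0 = non-anemic.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
metrics_anemic = per_class_metrics(y_true, y_pred, positive=1)
metrics_non_anemic = per_class_metrics(y_true, y_pred, positive=0)

print("accuracy:", accuracy)
print("class 1 (precision, recall, f1):", metrics_anemic)
print("class 0 (precision, recall, f1):", metrics_non_anemic)
```

Reporting the metrics for both classes matters when classes are imbalanced, since a high overall accuracy can hide poor recall on the minority class.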
Conclusions
The proposed framework is accurate, transparent, and adaptable. It supports early anemia detection and can assist clinical decision-making, especially in under-resourced regions, making it a valuable tool for global public health applications.