DOI: 10.1108/ecam-10-2025-1646 ISSN: 0969-9988

Application of stacking ensemble machine learning algorithm in predicting delays in water construction projects

Yasaman Aliakbarpour Garmroudi, Esmatullah Noorzai

Purpose

Delays in water construction projects cause severe financial losses and societal setbacks. This study develops a stacking ensemble machine learning model to predict delay severity, helping project managers mitigate risks and support sustainable infrastructure development.

Design/methodology/approach

Based on a literature review and 439 real water project contracts, five features were selected: project duration, cost, climate zone, change costs, and adjustment costs. The data were preprocessed with scikit-learn (standardization and Elliptic Envelope outlier detection). Four base learners (artificial neural network (ANN), decision tree, random forest, and k-nearest neighbors (KNN)) were tuned via grid search and combined in a stacking model with a random forest meta-learner, which was validated using repeated stratified 5-fold cross-validation.
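The workflow described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (439 rows, 5 features, 4 hypothetical delay classes), and the contamination rate, hyperparameter grids, number of repeats, and all function names are placeholders chosen for the example.

```python
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def make_demo_data(seed=0):
    """Synthetic stand-in for the contract data (assumption: 4 delay classes)."""
    return make_classification(
        n_samples=439, n_features=5, n_informative=4, n_redundant=0,
        n_classes=4, n_clusters_per_class=1, random_state=seed,
    )


def drop_outliers(X, y, contamination=0.05, seed=0):
    """Keep only rows that Elliptic Envelope labels as inliers (+1)."""
    X_std = StandardScaler().fit_transform(X)
    detector = EllipticEnvelope(contamination=contamination, random_state=seed)
    mask = detector.fit_predict(X_std) == 1
    return X[mask], y[mask]


def tuned_base_learners(X, y, seed=0):
    """Grid-search each base learner over a deliberately tiny, illustrative grid."""
    candidates = {
        # Scale-sensitive learners get their own scaler inside a pipeline.
        "ann": (make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=seed)),
                {"mlpclassifier__hidden_layer_sizes": [(16,), (32,)]}),
        "dt": (DecisionTreeClassifier(random_state=seed), {"max_depth": [3, 6, None]}),
        "rf": (RandomForestClassifier(random_state=seed), {"n_estimators": [50, 100]}),
        "knn": (make_pipeline(StandardScaler(), KNeighborsClassifier()),
                {"kneighborsclassifier__n_neighbors": [3, 5, 7]}),
    }
    return [(name, GridSearchCV(est, grid, cv=5).fit(X, y).best_estimator_)
            for name, (est, grid) in candidates.items()]


def build_stack(base_learners, seed=0):
    """Stack the base learners under a random forest meta-learner."""
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=RandomForestClassifier(n_estimators=100, random_state=seed),
        cv=5,  # out-of-fold base predictions are used to train the meta-learner
    )


def evaluate(model, X, y, n_repeats=2, seed=0):
    """Accuracy on every fold of repeated stratified 5-fold cross-validation."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=n_repeats, random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy")


if __name__ == "__main__":
    X, y = drop_outliers(*make_demo_data())
    stack = build_stack(tuned_base_learners(X, y))
    scores = evaluate(stack, X, y)
    print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

For brevity, the sketch tunes hyperparameters on the full dataset before cross-validation. A stricter evaluation would nest the grid search inside each training fold so that no test fold influences the tuning.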

Findings

The stacking model achieves an accuracy of 0.957, an F1-score of 0.957, and a Kappa of 0.935, outperforming the individual algorithms by up to 5.5% and surpassing prior benchmarks. It performs well on critical delay classes (4.4% error for the 30–60% delay class), supporting risk prediction and resource optimization.
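As an illustration of how such metrics are typically computed, the sketch below uses scikit-learn on a small hand-made label set. It rests on assumptions not stated in the abstract: the F1-score is averaged with class weights, and per-class error is taken as one minus per-class recall. The labels and the `summarize` helper are hypothetical.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix, f1_score


def summarize(y_true, y_pred):
    """Return overall metrics plus the error rate of each true class."""
    cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted
    per_class_error = 1.0 - np.diag(cm) / cm.sum(axis=1)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "f1_weighted": f1_score(y_true, y_pred, average="weighted"),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "per_class_error": per_class_error,
    }


# Toy labels for three hypothetical delay classes (not data from the study).
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 0]
report = summarize(y_true, y_pred)
for name, value in report.items():
    print(name, value)
```

Cohen's Kappa corrects accuracy for agreement expected by chance, which is why it is reported alongside accuracy when class sizes differ.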

Originality/value

This study applies stacking ensemble learning to delay forecasting in water projects for the first time, using real contract data to limit bias and overfitting. It offers a framework for proactive planning, cost-efficient buffering, and resilient project delivery.
