An Explainable Machine Learning Framework for Lost Circulation Type Diagnosis in Drilling Engineering
Huayan Mu, Guancheng Jiang, Jinsheng Sun, Tengfei Dong, Jinshu Wang, Shengming Huang, Jun Yang
Summary
Lost circulation is one of the most critical and knowledge-intensive challenges in drilling engineering and can lead to severe safety hazards such as wellbore instability, stuck pipe, and blowouts. Accurate post-event diagnosis of the lost-circulation type is essential for subsequent treatment planning and data-informed decision-making in drilling operations. To address the challenges of limited samples and imbalanced data, we develop an explainable machine learning framework for post-event lost-circulation type classification that systematically integrates empirical model selection, class imbalance handling, and multilevel interpretability. The synthetic minority oversampling technique for nominal and continuous features (SMOTENC) is adopted to mitigate class imbalance, and the categorical boosting (CatBoost) classifier is selected based on a comparative evaluation. To enhance transparency, Shapley additive explanations (SHAP) are used to examine feature contributions at the global, class-specific, and instance levels. The proposed framework not only improves classification performance under imbalanced conditions but also strengthens the interpretability and engineering traceability of the model outputs. The results show that the proposed framework achieves a weighted F1-score of 81.96% and an accuracy of 82.14%, while the area under the receiver operating characteristic (ROC) curve (AUC) improves from 0.8754 to 0.9543. The interpretability analysis indicates that lithology, shear stress, equivalent circulating density (ECD), and rheological properties are the most important predictive parameters for type discrimination.
The proposed framework provides a transparent, explainable approach for post-event lost-circulation type diagnosis based on recorded operational and geological observations, contributing to the development of intelligent and sustainable drilling systems.
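Because the summary reports both a weighted F1-score and an accuracy under class imbalance, it may help to see how the two metrics differ. The following is a minimal sketch, not the authors' evaluation code: the class labels "A", "B", and "C" and the toy predictions are hypothetical placeholders, and the weighting assumed here is the common support-weighted average of per-class F1 scores.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as positive and all others as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class is never hit.
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's share of true samples."""
    support = Counter(y_true)
    n = len(y_true)
    return sum(support[c] / n * per_class_f1(y_true, y_pred, c) for c in support)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical imbalanced toy data: 6 samples of "A", 3 of "B", 1 of "C".
y_true = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
y_pred = ["A", "A", "A", "A", "A", "B", "B", "B", "A", "A"]

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(y_true, y_pred):.4f}")
    print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

In this toy case the minority class "C" is never predicted correctly, so its F1 of zero pulls the weighted F1 below the accuracy. This is the kind of gap that oversampling the minority classes is meant to reduce.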