Study on small sample prediction of skeleton curve of concrete‐filled double skin steel tubular columns with pitting corrosion
Mingqiang Wang, Jiren Li, Ping Lyu, Shengxuan Ding
Abstract
To address small‐sample prediction of skeleton curves for pitted circular concrete‐filled double‐skin steel tubular (CFDST) columns, this study proposes an integrated methodology combining experimental investigation, finite element (FE) modeling, data augmentation, and machine learning. A refined FE model incorporating random pitting characteristics was developed and validated against tests with 10% and 20% corrosion rates, establishing an initial database of 126 samples. To overcome data scarcity, a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN‐GP) was employed for data augmentation, expanding the dataset to 626 samples while preserving the characteristics of the original distribution. A systematic evaluation of machine learning algorithms showed that XGBoost achieved the best performance in predicting the key points of the skeleton curve. Under small‐sample conditions, prediction accuracy improved significantly: R² increased from 0.94 to 0.987, while RMSE, MAE, and MSE decreased by approximately 17%, 18%, and 60%, respectively. Prediction error remained consistently below 6.5%. SHAP interpretability analysis revealed that geometric parameters, particularly outer diameter, exert a more dominant influence on peak load than corrosion characteristics. Corrosion rate exhibits a strong negative correlation with no saturation effect, while corrosion length shows a threshold effect: beyond approximately 1000 mm its marginal impact diminishes, consistent with local buckling mechanics. This research demonstrates that integrating data augmentation with model optimization provides an effective solution to small‐sample prediction challenges, offering a novel approach for evaluating the mechanical performance of corroded steel structures, for which experimental data are inherently limited.
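
The evaluation metrics named in the abstract (R², RMSE, MAE, MSE, and a relative prediction error) follow standard definitions. The listing below is a minimal sketch of how they might be computed, assuming paired lists of observed and predicted values. The function names and the sample peak-load values are hypothetical illustrations and are not taken from the study's database or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, and MSE for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error and mean absolute error
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"r2": r2, "rmse": math.sqrt(mse), "mae": mae, "mse": mse}


def max_relative_error(y_true, y_pred):
    """Largest absolute relative error; observed values must be nonzero."""
    return max(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


# Hypothetical peak loads in kN, used only to exercise the functions.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

metrics = regression_metrics(observed, predicted)
worst_case = max_relative_error(observed, predicted)
```

Applying the same functions to predictions from models trained on the original and on the augmented dataset would give the kind of before/after comparison summarized above.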