Geometric shift-aware uncertainty quantification for discharge coefficient prediction of triangular planform weirs
Zeroual Abdelatif, Fourar Ali
Abstract
Reliable estimation of the discharge coefficient Cd is central to sharp-crested weir hydraulics, yet data-driven models are commonly evaluated with random train-test splits that can overstate generalization when experiments are structured by discrete geometries. Using a reconstructed run-level dataset for triangular planform weirs (n = 123) spanning vertex angles θ ∈ {30°, 60°, 90°, 120°, 150°, 180°}, this study benchmarks tabular regression under (i) random 80/20 splitting, which represents interpolation within a mixed-geometry dataset, and (ii) a leave-one-angle-out (LOAO) protocol that enforces geometric domain shift. Dimensionless predictors are derived from Buckingham-π reasoning and formulated to remain numerically stable as θ → 180°. Under random splitting, gradient boosting achieves high accuracy, whereas LOAO reveals substantially weaker generalization, with the largest errors near the θ = 180° regime transition. For uncertainty quantification, global split conformal intervals are near-nominal under random splits but severely under-cover under LOAO, indicating that calibration residuals are not transferable across geometries. CV-pooled residual calibration restores LOAO coverage at the cost of substantially wider intervals. Finally, combining k-nearest-neighbor distance-based out-of-distribution detection with a two-tier interval policy yields intermediate operating points that better balance reliability and sharpness when deploying beyond the calibration envelope. Overall, the results indicate that apparently strong interpolation performance can conceal important geometric extrapolation failures, and that safe deployment on novel weir geometries requires explicitly shift-aware uncertainty quantification.
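To make the evaluation protocol concrete, the following is a minimal Python sketch of leave-one-angle-out folds combined with a global split conformal interval. It is an illustration under stated assumptions, not the study's implementation: the function names (`conformal_quantile`, `loao_folds`, `loao_coverage`) are hypothetical, a constant mean predictor stands in for the gradient boosting model, the alternating train/calibration split is an arbitrary choice, and the toy Cd values are synthetic placeholders rather than the experimental data.

```python
import math
from collections import defaultdict


def conformal_quantile(residuals, alpha):
    """Finite-sample split conformal quantile of absolute residuals.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest residual; returns
    infinity when the calibration set is too small for the requested level.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def loao_folds(angles):
    """Yield (held_out_angle, train_idx, test_idx) for each distinct angle."""
    by_angle = defaultdict(list)
    for i, a in enumerate(angles):
        by_angle[a].append(i)
    for a in sorted(by_angle):
        train = [i for i in range(len(angles)) if angles[i] != a]
        yield a, train, by_angle[a]


def loao_coverage(angles, y, alpha=0.1):
    """Empirical coverage per held-out angle for global split conformal.

    ASSUMPTION: a constant mean predictor replaces the regression model,
    and training runs are split alternately into fit and calibration sets.
    """
    coverage = {}
    for angle, train, test in loao_folds(angles):
        fit_idx, cal_idx = train[0::2], train[1::2]
        mean = sum(y[i] for i in fit_idx) / len(fit_idx)  # placeholder model
        q = conformal_quantile([abs(y[i] - mean) for i in cal_idx], alpha)
        hits = sum(abs(y[i] - mean) <= q for i in test)
        coverage[angle] = hits / len(test)
    return coverage


# Synthetic toy data (NOT the paper's dataset): four runs per vertex angle,
# with a made-up dependence of Cd on the angle.
toy_angles = [a for a in (30, 60, 90, 120, 150, 180) for _ in range(4)]
toy_cd = [0.60 + 0.001 * a + 0.005 * j
          for a in (30, 60, 90, 120, 150, 180) for j in range(4)]

for angle, cov in loao_coverage(toy_angles, toy_cd).items():
    print(f"held-out angle {angle:3d} deg: coverage = {cov:.2f}")
```

Because the calibration residuals in each fold come only from the angles seen in training, the interval half-width `q` carries no information about the held-out geometry. This is the mechanism by which a global split conformal interval can lose coverage under LOAO even when it is near-nominal under random splitting.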