DOI: 10.3390/met16070726 ISSN: 2075-4701

Benchmarking Convolutional Neural Network Architectures for Multi-Phase Semantic Segmentation: Challenges in Resolving Widmanstätten Ferrite Within Ferritic–Pearlitic Matrices

Fritz Backofen, Kristin Hockauf, Thorsten Halle

The welded bead bending test (WBBT) is a pivotal procedure for evaluating the crack-arrest capacity of structural steels used in the construction of safety-critical infrastructure in accordance with ZTV-ING Part 4 or Deutsche Bahn Standard 918 002-02. Previous research has established that light optical microscopy images (LOMs) of specimens that fail the WBBT are frequently characterised by a high prevalence of Widmanstätten ferrite within the base material. While convolutional neural networks (CNNs) have successfully classified WBBT outcomes based on LOMs, the implemented approaches did not enable the simultaneous pixel-wise delineation required for accurate quantification of Widmanstätten ferrite. Moreover, the tonal similarity between Widmanstätten ferrite and polygonal ferrite renders conventional intensity-based thresholding ineffective for differentiation, necessitating a morphology-based segmentation approach. The present study proposes an automated semantic segmentation framework for the precise delineation of three microstructural phases present within the WBBT LOMs: polygonal ferrite, pearlite, and Widmanstätten ferrite. Using a dataset of 20 LOMs and corresponding manually annotated masks, five UNet encoder backbones were evaluated: ResNet-152, Xception, SE-ResNet-50, DenseNet-169, and EfficientNet-B5. To identify the optimal configuration for this microstructural segmentation task, each architecture was assessed using three distinct weight initialisation strategies: (A) ImageNet, (B) MicroNet, and (C) a combined approach. For Widmanstätten ferrite segmentation at patch level, DenseNet-169 pretrained on ImageNet achieved the best performance (Dice: 50.75% ± 3.38%, IoU: 38.45% ± 3.01%). Following inference-based aggregation, Xception pretrained on ImageNet yielded improved results (Dice: 68.74% ± 1.12%, IoU: 52.52% ± 1.28%), with a mean absolute error (MAE) of 4.55% ± 0.80%.
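
The Dice and IoU scores reported above are standard overlap metrics for a single class within a multi-class label mask. The following is a minimal sketch of how such per-class scores might be computed; it is not the authors' evaluation code. The integer label encoding, the function names, and the toy 4×4 masks are assumptions made for illustration. The abstract does not state how the MAE is defined, so the sketch assumes it refers to the absolute error in the predicted area fraction of a phase.

```python
import numpy as np

# Hypothetical label encoding; the abstract does not specify one.
PHASES = {0: "polygonal ferrite", 1: "pearlite", 2: "Widmanstätten ferrite"}


def dice_iou(pred, truth, label):
    """Return (Dice, IoU) for one class from two integer label masks."""
    p = pred == label
    t = truth == label
    inter = np.logical_and(p, t).sum()
    total = p.sum() + t.sum()      # |P| + |T|
    union = total - inter          # |P ∪ T|
    if total == 0:
        # Convention chosen here: a class absent from both masks counts as
        # perfect agreement. Other conventions (e.g. skipping it) exist.
        return 1.0, 1.0
    return float(2.0 * inter / total), float(inter / union)


def area_fraction(mask, label):
    """Fraction of pixels in the mask that carry the given label."""
    return float((mask == label).mean())


# Toy masks (illustrative only, not data from the study).
truth = np.array([[0, 0, 1, 1],
                  [0, 2, 2, 1],
                  [0, 2, 2, 1],
                  [0, 0, 1, 1]])
pred = np.array([[0, 0, 1, 1],
                 [0, 2, 0, 1],
                 [0, 2, 2, 1],
                 [0, 0, 0, 1]])

dice, iou = dice_iou(pred, truth, label=2)
# Assumed MAE ingredient: absolute error of the phase area fraction.
frac_err = abs(area_fraction(pred, 2) - area_fraction(truth, 2))
print(f"{PHASES[2]}: Dice={dice:.3f}, IoU={iou:.3f}, fraction error={frac_err:.4f}")
```

For a single class the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Averaging over patches or images breaks the exact identity, which would explain why the reported mean values do not satisfy it precisely.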
