DOI: 10.3390/s26134109 ISSN: 1424-8220

Metadata Conditioning via Feature-Wise Linear Modulation for Multi-Domain 3D Abdominal Segmentation: A Comparative Study Between U-Net and nnU-Net

Juan F. Garrido-Martínez, Javier M. Garrido-López, Juan F. Zapata-Pérez, Juan Martínez-Alajarín

Automatic 3D abdominal organ segmentation in magnetic resonance imaging faces a major challenge: data heterogeneity arising from differences in acquisition protocol, contrast, spatial resolution, and imaging characteristics, which can degrade model generalization. This work investigates FiLM (Feature-wise Linear Modulation) as a metadata-conditioning mechanism for improving the contextual adaptation of 3D segmentation networks. Two reference architectures, U-Net and nnU-Net v2, were compared in their baseline versions and in FiLM-conditioned variants. The evaluation was carried out on the CHAOS and AMOS datasets, and on a combined CHAOS+AMOS setting used to analyze multi-domain robustness. The results show that the effect of FiLM depends strongly on the underlying architecture. In U-Net, FiLM produced favourable quantitative trends in the final post-processed predictions: a small gain on CHAOS, a clearer improvement on AMOS, and the largest effect in the combined scenario, where clean Dice increased by 0.0215 while HD95 and ASD decreased by 6.66 mm and 3.85 mm, respectively. These improvements were not uniform across all metrics and organs. In nnU-Net, by contrast, conditioning did not provide consistent benefits and reduced performance in some scenarios, particularly on AMOS. In addition, post-processing to remove islands and disconnected components had a much greater impact on U-Net than on nnU-Net, indicating that nnU-Net produced cleaner and more stable segmentations directly from the raw output. Overall, this work suggests that FiLM can reinforce more generic architectures such as U-Net, especially as problem heterogeneity increases, but it was not consistently beneficial for nnU-Net, whose baseline was already the most robust and best-balanced model in the study. These results support the idea that the value of metadata conditioning is not universal but depends on the base architecture and the generalization scenario under consideration.
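
For readers unfamiliar with FiLM, the general mechanism can be sketched as a per-channel scale and shift of a feature map, with both parameters predicted from a conditioning vector. The NumPy example below is only an illustration of that idea, not the implementation used in this study: the function names, the single linear layer that maps metadata to the modulation parameters, the three-element metadata encoding, and the tensor shapes are all assumptions made for the sketch.

```python
import numpy as np


def film_params(metadata, w_gamma, b_gamma, w_beta, b_beta):
    """Map a metadata vector of shape (M,) to per-channel scale and shift of shape (C,).

    A single linear layer is assumed here for brevity; a small MLP is equally possible.
    """
    gamma = w_gamma @ metadata + b_gamma
    beta = w_beta @ metadata + b_beta
    return gamma, beta


def film_modulate(features, gamma, beta):
    """Apply FiLM to a 3D feature map of shape (C, D, H, W)."""
    # Broadcast the per-channel parameters over the three spatial axes.
    return gamma[:, None, None, None] * features + beta[:, None, None, None]


rng = np.random.default_rng(0)
C, M = 4, 3  # hypothetical channel count and metadata length

features = rng.standard_normal((C, 2, 5, 5))
# Hypothetical metadata encoding (for example, a dataset flag and normalized
# acquisition parameters); the encoding used in the paper is not given here.
metadata = np.array([1.0, 0.0, 0.5])

# Biases chosen so that gamma starts near 1 and beta near 0, which keeps the
# modulation close to the identity before any training (an assumption).
w_gamma = 0.1 * rng.standard_normal((C, M))
b_gamma = np.ones(C)
w_beta = 0.1 * rng.standard_normal((C, M))
b_beta = np.zeros(C)

gamma, beta = film_params(metadata, w_gamma, b_gamma, w_beta, b_beta)
modulated = film_modulate(features, gamma, beta)
```

Because the modulation acts per channel and leaves the spatial dimensions unchanged, a layer of this kind can in principle be placed after any convolutional block in an encoder-decoder network; where it is placed in the compared architectures is not stated in this abstract.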
