DOI: 10.3390/brainsci16060652 ISSN: 2076-3425

Agreement and Calibration Between FreeSurfer and Visually Quality-Controlled FSL/FAST–ALVIN Lateral Ventricle Volumetry in a Population-Based MRI Cohort

Daniel Cantré, Felix Streckenbach, Sönke Langner, Thomas Beyer

Background/Objectives. Automated lateral ventricle volumetry is increasingly used in population-based neuroimaging, but correlation between methods does not establish agreement of absolute volumes. We quantified agreement and calibration between FreeSurfer and a visually quality-controlled FSL/FAST–ALVIN lateral ventricle workflow within the Study of Health in Pomerania (SHIP).

Methods. This cross-sectional agreement-and-calibration study included 2988 SHIP participants with visually accepted FSL/FAST–ALVIN total lateral ventricle volumes; paired FreeSurfer data were available for 1913 participants. FSL/FAST–ALVIN was treated as the study reference scale rather than biological ground truth. Agreement was assessed using Pearson and Spearman correlations, Bland–Altman analysis, log-ratio agreement, Lin’s concordance correlation coefficient, and a two-way mixed-effects single-measure absolute agreement intraclass correlation coefficient. Directional calibration models predicted FSL/FAST–ALVIN volume from FreeSurfer volume and were internally validated using 2000 bootstrap resamples.

Results. In the paired sample, volumes were almost perfectly associated (Pearson r = 0.9978; Spearman ρ = 0.9974), but FreeSurfer yielded systematically lower values (mean FreeSurfer-minus-FSL bias, −3.02 mL; 95% limits of agreement, −4.52 to −1.53 mL; geometric mean FreeSurfer/FSL ratio, 0.844). Lin’s concordance coefficient and the absolute agreement ICC were both 0.9598. Calibration was strong but workflow-specific: FSL/FAST–ALVIN volume = 2.611 + 1.0210 × FreeSurfer volume (R² = 0.9955; optimism-corrected RMSE = 0.732 mL).

Conclusions. FreeSurfer and visually quality-controlled FSL/FAST–ALVIN preserved participant ranking extremely well but were not directly interchangeable as absolute measurements. Cross-workflow comparisons require explicit method reporting, formal agreement analysis, and calibration to the intended measurement scale; the equation should not be used as a universal conversion formula outside comparable acquisition, segmentation, QC and software settings.
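The agreement measures and the reported calibration equation can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names are hypothetical, the 1.96 × SD multiplier for the limits of agreement and the population-variance form of Lin's coefficient are assumptions about the exact estimators, and the demonstration data are synthetic values generated only to exercise the functions. Only the intercept (2.611) and slope (1.0210) are taken from the abstract.

```python
import numpy as np


def bland_altman(a, b):
    """Return (bias, lower LoA, upper LoA) for the paired differences a - b.

    Assumption: 95% limits of agreement are bias +/- 1.96 * sample SD.
    """
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = d.mean()
    sd = d.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def geometric_mean_ratio(a, b):
    """Geometric mean of a/b, computed on the log scale (log-ratio agreement)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.exp(np.mean(np.log(a / b))))


def lin_ccc(a, b):
    """Lin's concordance correlation coefficient.

    Assumption: population (1/n) variances and covariance are used.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cov = np.mean((a - a.mean()) * (b - b.mean()))
    return float(2 * cov / (a.var() + b.var() + (a.mean() - b.mean()) ** 2))


def calibrate_to_fsl(freesurfer_ml, intercept=2.611, slope=1.0210):
    """Map FreeSurfer volume (mL) to the FSL/FAST-ALVIN scale.

    Coefficients are the workflow-specific values reported in the abstract;
    they are not a universal conversion.
    """
    return intercept + slope * np.asarray(freesurfer_ml, dtype=float)


if __name__ == "__main__":
    # Synthetic data only: not SHIP measurements.
    rng = np.random.default_rng(0)
    fs = rng.lognormal(mean=2.8, sigma=0.5, size=500)
    fsl = calibrate_to_fsl(fs) + rng.normal(0.0, 0.7, size=500)

    print("Bland-Altman (FS - FSL):", bland_altman(fs, fsl))
    print("Geometric mean FS/FSL ratio:", geometric_mean_ratio(fs, fsl))
    print("Lin's CCC:", lin_ccc(fs, fsl))
    print("Pearson r:", np.corrcoef(fs, fsl)[0, 1])
```

On data of this kind the Pearson correlation stays close to 1 while the concordance coefficient is pulled down by the constant offset. This is the distinction the abstract draws between preserved ranking and agreement of absolute volumes.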
