Quantifying interpretive contributions to analytical variability in clinical flow cytometry
Paul E. Mead, Baptiste Labarthe, Christy Embrey, Jeffery M. Klco, Kamila Czechowska
Abstract
Reproducibility in clinical flow cytometry is essential for diagnosis, longitudinal monitoring, and interlaboratory harmonization, particularly when threshold‐based lymphocyte measurements such as absolute CD4 counts guide clinical management. Although standardization of instruments, reagents, antibody panels, and acquisition protocols has reduced variability at the level of signal generation, variability introduced by human interpretation during manual gating has not been quantitatively separated from biological and technical sources. Here, we applied a two‐stage workflow audit anchored to a fixed, invariant automated analysis used as an analytical reference to quantify interpretive variability in routine clinical TBNK (T‐cell, B‐cell, and natural killer cell) flow cytometry and to assess its impact on CD4 decision‐band classification. Standardized quality‐control materials and 320 consecutive clinical samples, independently analyzed by six technologists on three harmonized cytometers, were used to define an empirical analytical performance envelope (i.e., the expected range of performance) and to evaluate departures from this envelope under routine conditions. Under quality‐control conditions, automated–manual differences were stable and interpretable: percentage‐based endpoints showed near‐zero bias with narrow dispersion, absolute counts exhibited small, consistent offsets, and CD4 decision‐band assignment was preserved. In contrast, routine clinical samples showed substantially greater dispersion that was structured primarily by operator identity rather than instrument configuration. Operator‐associated disagreement exceeded instrument‐associated variability, was low dimensional and reproducible, and was concentrated near established CD4 thresholds, yielding discordant but adjacent decision‐band classifications. 
These findings quantify a previously unmeasured source of analytical variability in routine hematology testing that can affect threshold‐based clinical classification despite acceptable quality‐control performance.
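As an illustration of the quantities named above, the following minimal sketch computes the bias and dispersion of paired automated and manual absolute CD4 counts and checks whether decision-band assignment is preserved. It is not the study's analysis code. The band edges (200, 350, and 500 cells/µL), the function names, and the example counts are assumptions introduced here for illustration only.

```python
from bisect import bisect_right
from statistics import mean, stdev

# Assumed decision-band edges in cells/µL (illustrative; not taken from the study).
CD4_BAND_EDGES = [200, 350, 500]


def decision_band(cd4_count):
    """Return the index of the band containing an absolute CD4 count.

    Band 0 is below the first edge; a count equal to an edge falls in the upper band.
    """
    return bisect_right(CD4_BAND_EDGES, cd4_count)


def paired_summary(automated, manual):
    """Summarize manual-minus-automated differences and band concordance."""
    if len(automated) != len(manual):
        raise ValueError("automated and manual must have the same length")

    diffs = [m - a for a, m in zip(automated, manual)]
    bias = mean(diffs)   # systematic offset of manual relative to automated
    sd = stdev(diffs)    # dispersion of the paired differences (sample SD)

    bands = [(decision_band(a), decision_band(m)) for a, m in zip(automated, manual)]
    discordant = [(ba, bm) for ba, bm in bands if ba != bm]
    adjacent = sum(1 for ba, bm in discordant if abs(ba - bm) == 1)

    return {
        "bias": bias,
        "sd": sd,
        # Approximate 95% limits of agreement, assuming roughly normal differences.
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        "n_discordant": len(discordant),
        "n_adjacent": adjacent,
    }


# Hypothetical paired counts (cells/µL), chosen to sit near the assumed edges.
automated_counts = [190, 340, 510, 800]
manual_counts = [205, 345, 495, 790]
summary = paired_summary(automated_counts, manual_counts)
print(summary)
```

In this made-up example the mean difference is small, yet two of the four samples change band because they lie close to an edge, and both changes are to an adjacent band. This mirrors the pattern described above, in which modest disagreement near a threshold is enough to alter classification.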