DOI: 10.1093/ecco-jcc/jjad134

Alternative endoscopy reading paradigms determine score reliability and effect size in ulcerative colitis

Walter Reinisch, Vivek Pradhan, Saira Ahmad, Zhen Zhang, Jeremy D Gale
  • Gastroenterology
  • General Medicine



Central reading of endoscopy is advocated by regulatory agencies for clinical trials in ulcerative colitis [UC]. It is uncertain whether the local/site reader should be included in the reading paradigm. We explore whether utilizing locally and centrally determined endoscopic Mayo scores [eMS] provides a reliable final assessment and whether the paradigm used has an impact on effect size.


eMS data from the TURANDOT (NCT01620255) study were used to retrospectively examine seven different reading paradigms (utilizing the scores of local readers [LR], first central readers [CR1], second central readers [CR2], and various consensus reads [ConCR]) by assessing inter-rater reliabilities and their impact on the key study endpoint, endoscopic improvement.


More than 40% of the eMS assigned by two trained central readers were discordant. Central readers showed wide variability in scoring at Baseline (intraclass correlation coefficient [ICC] of 0.475 [0.339, 0.610] for CR1 vs. CR2). Centrally read scores had variable concordance with LR (LR vs. CR1 ICC 0.682 [0.575, 0.788], and LR vs. CR2 ICC 0.526 [0.399, 0.653]). Reading paradigms with LR and CR that included a consensus read raised ICC estimates to > 0.8. At Week 12, the CR1 vs. CR2 ICC estimate without consensus reads was 0.775 [0.710, 0.841], and with consensus reads the ICC estimates were > 0.9. Consensus-based approaches were the most favourable for detecting a treatment difference.
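
For readers unfamiliar with the two agreement measures reported above, the following is a minimal sketch of how a discordance rate and an ICC between two readers can be computed. It assumes a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)); the abstract does not state which ICC form the authors used, so this choice is an assumption. The function names and the toy scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def discordance_rate(r1: Sequence[float], r2: Sequence[float]) -> float:
    """Fraction of paired reads in which the two readers gave different scores."""
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(a != b for a, b in zip(r1, r2)) / len(r1)


def icc_2_1(r1: Sequence[float], r2: Sequence[float]) -> float:
    """ICC(2,1) for two readers, from two-way ANOVA mean squares.

    Assumed form (not confirmed by the source): two-way random effects,
    absolute agreement, single rater.
    """
    n, k = len(r1), 2  # n subjects (videos), k readers
    if n != len(r2) or n < 2:
        raise ValueError("need two score lists of equal length, n >= 2")

    pairs = list(zip(r1, r2))
    grand = (sum(r1) + sum(r2)) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]   # per-subject means
    col_means = [sum(r1) / n, sum(r2) / n]        # per-reader means

    # Sums of squares: subjects, readers, total, and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for pair in pairs for x in pair)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subject mean square
    msc = ss_cols / (k - 1)                # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square

    denom = msr + (k - 1) * mse + k * (msc - mse) / n
    if denom == 0:
        raise ValueError("scores have no variance; ICC is undefined")
    return (msr - mse) / denom


# Hypothetical toy scores on the 0-3 eMS scale (illustration only).
toy_cr1 = [1, 2, 3, 2]
toy_cr2 = [2, 2, 3, 1]

if __name__ == "__main__":
    print("discordance:", discordance_rate(toy_cr1, toy_cr2))
    print("ICC(2,1):", icc_2_1(toy_cr1, toy_cr2))
```

The sketch returns point estimates only; the bracketed intervals reported above would require an additional interval procedure that is not shown here.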


The ICC between the eMS of two trained and experienced central readers is unexpectedly low, which reinforces that currently used central reading processes still have several weaknesses.
