DOI: 10.1111/insr.12606 ISSN: 0306-7734

Chance‐Corrected Interrater Agreement Statistics for Two‐Rater Dichotomous Responses: A Method Review With Comparative Assessment Under Possibly Correlated Decisions

Zizhong Tian, Vernon M. Chinchilli, Chan Shen, Shouhao Zhou

Summary

Measurement of interrater agreement (IRA) is critical for assessing the reliability and validity of ratings in various disciplines. While numerous IRA statistics have been developed, there is a lack of guidance on selecting appropriate measures, especially when raters' decisions could be correlated. To address this gap, we review a family of chance‐corrected IRA statistics for two‐rater dichotomous‐response cases, a fundamental setting that not only serves as the theoretical foundation for categorical‐response or multirater IRA methods but is also practically dominant in most empirical studies. We also propose a novel data‐generating framework to simulate correlated decision processes between raters. We then introduce a new estimand that calibrates the ‘true’ chance‐corrected IRA while accounting for potential ‘probabilistic certainty’. Extensive simulations were conducted to evaluate the performance of the reviewed IRA methods under various practical scenarios, and the results were summarised by an agglomerative hierarchical clustering analysis. Finally, we provide recommendations for selecting appropriate IRA statistics based on outcome prevalence and rater characteristics, and we highlight the need for further advancements in IRA estimation methodologies.
