DOI: 10.1111/jebm.70141 ISSN: 1756-5383

Adherence of CLAIM in Artificial Intelligence Research: A Cross‐Sectional Study

Chenhao Gao, Yilan Zhang, Renjie Zhao, Zhongfan Liao, Xuhui Zhang, Wenjie Yang, Ying Wang, Bingran Zhang, Yonggang Zhang, Nian Li

ABSTRACT

Objective

To analyze adherence to the Checklist for Artificial Intelligence (AI) in Medical Imaging (CLAIM) among AI studies published in top medical imaging journals.

Methods

AI research published in top medical imaging journals was identified through a search of the Web of Science Core Collection. Adherence was assessed using the CLAIM reporting score and compliance rate. Potential influencing factors were also analyzed.
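The two adherence measures named above can be illustrated with a minimal sketch. The abstract does not describe the scoring rules, so everything here is an assumption: each checklist item is coded "yes", "no", or "na" (not applicable), the score counts the "yes" items, and the compliance rate divides that count by the number of applicable items. The function names and the toy article are hypothetical.

```python
def claim_score(items):
    """Number of checklist items coded as reported ("yes")."""
    return sum(1 for status in items.values() if status == "yes")


def compliance_rate(items):
    """Percentage of applicable items that were reported.

    Assumption: items coded "na" are excluded from the denominator.
    """
    applicable = [status for status in items.values() if status != "na"]
    if not applicable:
        return 0.0
    reported = sum(1 for status in applicable if status == "yes")
    return 100.0 * reported / len(applicable)


# Hypothetical article assessed on six items (the real checklist has 42).
example_article = {1: "yes", 2: "yes", 3: "no", 4: "na", 5: "yes", 6: "no"}

score = claim_score(example_article)
rate = compliance_rate(example_article)
print(f"score = {score}, compliance rate = {rate:.1f}%")
```

Under these assumptions the toy article scores 3 points with a compliance rate of 60.0%, because the one non-applicable item is removed from the denominator.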

Results

A total of 501 articles were included. After quality assessment, the median CLAIM score was 19 points (25th–75th percentile: 17–21 points), and the median overall compliance rate was 51.4% (25th–75th percentile: 46.2%–57.9%). Among the 42 items, 14 had compliance rates of ≥80%, 11 had compliance rates of 40%–79%, and 17 had compliance rates of <40%. Low compliance was concentrated in items such as quality control of annotations, sample size determination, model robustness evaluation, failure analysis, and code/data availability.

In terms of temporal trends, both the overall score (ρ = 0.119, p = 0.008) and the compliance rate (ρ = 0.139, p = 0.002) showed weak positive correlations with publication year. Furthermore, compliance with items 2 (structured abstract) (ρ = 0.129, p = 0.004), 33 (participant flow diagram) (ρ = 0.131, p = 0.003), 34 (demographic/clinical characteristics by partition) (ρ = 0.122, p = 0.006), 41 (study protocol/data/code availability) (ρ = 0.094, p = 0.036), and 42 (funding sources) (ρ = 0.128, p = 0.004) increased significantly over time.

Logistic regression analysis identified the following negative predictors of reporting quality: research objectives focused on image reconstruction (OR = 0.416, 95% CI: 0.203–0.842, p = 0.015), artifact reduction (OR = 0.072, 95% CI: 0.009–0.365, p = 0.004), multimodal imaging (OR = 0.301, 95% CI: 0.090–0.899, p = 0.038), and use of public (OR = 0.508, 95% CI: 0.289–0.883, p = 0.017) and public–private hybrid (OR = 0.339, 95% CI: 0.146–0.759, p = 0.010) development datasets. Conversely, positive predictors were publication year (OR = 1.246, 95% CI: 1.067–1.458, p = 0.006); publication in a CLAIM-adopting journal (OR = 1.737, 95% CI: 1.070–2.481, p = 0.026); code availability (OR = 1.929, 95% CI: 1.181–3.186, p = 0.009); and development dataset sizes of 100–199 cases (OR = 2.187, 95% CI: 1.121–4.320, p = 0.023), 500–999 cases (OR = 3.409, 95% CI: 1.508–7.902, p = 0.004), and ≥1000 cases (OR = 2.798, 95% CI: 1.335–5.979, p = 0.007).
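For readers who want to relate a reported odds ratio to its confidence interval and p value, the following is a rough sketch using the Wald approximation. It is not the authors' analysis: it assumes the 95% CI is symmetric on the log-odds scale, which need not hold if the intervals were obtained by another method such as profile likelihood. The function name `wald_from_or` is hypothetical; the numbers passed to it are the publication-year estimate reported above.

```python
import math


def wald_from_or(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds, standard error, z statistic, and two-sided p value.

    Assumption: the 95% CI was computed as exp(beta +/- z_crit * se).
    """
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Publication year: OR = 1.246, 95% CI: 1.067-1.458 (values from the abstract).
beta, se, z, p = wald_from_or(1.246, 1.067, 1.458)
print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

For this estimate the approximation gives a p value that rounds to 0.006, in line with the reported value. Estimates whose intervals are clearly asymmetric around the log odds ratio will not reproduce as closely.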

Conclusions

Adherence to CLAIM among AI research in top medical imaging journals remains inadequate. Joint efforts by researchers, editors, and reviewers are needed to strengthen the application of CLAIM and thereby improve research quality and reproducibility.
