Addressing radiologist inter-reader variability in gastric cancer T staging: A prospective study using the dual-module AI model, TRACE.

doi:10.1200/jco.2026.44.19_suppl.102

ISSN: 0732-183X

Chun Yang, Jing Zhang, Yan Zhao, Xiaomiao Chai, Zishuo Yan, Jia Wei, Guoliang Zheng

102

Background: Gastric cancer, a leading global malignancy with high incidence and mortality, poses a significant public health burden in China. Accurate preoperative T staging, which assesses tumor invasion depth on contrast-enhanced computed tomography (CE-CT), is crucial for treatment planning but suffers from inter-observer variability, particularly in discriminating T4a from T4b. To improve diagnostic precision and reproducibility, we introduce TRACE (Two-component Radiology-guided Autonomous Cascade Engine), a dual-module deep learning model for automated gastric cancer T staging.

Methods: This multicenter study included both retrospective and prospective cohorts. TRACE uses a dynamic cascaded architecture that simulates radiologists' stepwise diagnostic process: first, a hybrid CNN and hierarchical Transformer model performs initial T1-T4 screening; if a case is classified as T4, a second, specialized ResNet-152 model is automatically triggered for T4a/T4b subclassification. A total of 3,874 patients from three medical centers were enrolled. The two sub-models were trained on 1,652 (T1-T4) and 567 (T4a/T4b) samples from one center and validated on two retrospective external cohorts (EVC1: n=850; EVC2: n=550) and one prospective cohort (PVC: n=822). Using postoperative pathology as the gold standard, we evaluated discriminative performance (accuracy, macro-averaged AUC), calibration (Brier score), and clinical utility (comparison with radiologists).

Results: TRACE performed robustly across the independent validation cohorts. For the five-class T staging task, the model achieved a macro-averaged AUC of 0.964 (95% CI: 0.951–0.976) with 94.4% accuracy in EVC1, and a macro-averaged AUC of 0.949 with 82.4% accuracy in the prospective cohort. The T4 subclassification model attained accuracies ranging from 86.7% to 96.2% in external validation. TRACE assistance significantly improved radiologists' diagnostic performance, increasing average sensitivity from 40.0% to 75.2% and overall accuracy from 41.2% to 79.6%. Grad-CAM analysis showed that the model's attention regions aligned with clinically relevant anatomical landmarks.

Conclusions: This study developed and validated, across multiple centers, TRACE, the first AI system enabling end-to-end prediction of all five gastric cancer T stages (T1, T2, T3, T4a, T4b). The model demonstrated high accuracy, generalizability across external and prospective cohorts, and clinical utility, addressing the subjectivity and inconsistency of current CT interpretation. TRACE provides an automated tool for precise preoperative staging and individualized treatment decision-making in gastric cancer.
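
The two-step routing described in Methods can be illustrated with a minimal sketch. This is not the TRACE implementation: the function names, the feature input, and the stub classifiers below are all hypothetical stand-ins, and choosing the most probable label at each step is an assumption, since the abstract does not state the decision rule. The sketch shows only the control flow, in which the second model runs solely for cases the first model labels T4.

```python
from typing import Callable, Dict, Sequence

# A classifier here is any callable mapping an input to label probabilities.
# Both the type alias and the feature representation are hypothetical.
Probs = Dict[str, float]
Classifier = Callable[[Sequence[float]], Probs]


def most_probable(probs: Probs) -> str:
    """Return the label with the highest probability (assumed decision rule)."""
    return max(probs, key=lambda label: probs[label])


def cascade_predict(
    features: Sequence[float],
    screen: Classifier,
    subclassify: Classifier,
) -> str:
    """Route a case through a two-step cascade.

    Step 1 assigns a coarse stage (T1, T2, T3, or T4).
    Step 2 runs only when step 1 returns T4 and refines it to T4a or T4b,
    so the final output is one of five classes.
    """
    coarse = most_probable(screen(features))
    if coarse != "T4":
        return coarse
    return most_probable(subclassify(features))


# Stub classifiers standing in for the trained networks (illustrative only).
def stub_screen(features: Sequence[float]) -> Probs:
    if features[0] > 0.5:
        return {"T1": 0.05, "T2": 0.10, "T3": 0.15, "T4": 0.70}
    return {"T1": 0.10, "T2": 0.60, "T3": 0.20, "T4": 0.10}


def stub_subclassify(features: Sequence[float]) -> Probs:
    return {"T4a": 0.80, "T4b": 0.20}


if __name__ == "__main__":
    print(cascade_predict([0.9], stub_screen, stub_subclassify))
    print(cascade_predict([0.1], stub_screen, stub_subclassify))
```

One consequence of this design is that an error at the screening step cannot be corrected downstream: a true T4 case screened as T3 never reaches the T4a/T4b model, which is why the five-class metrics and the T4 subclassification metrics are reported separately.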
