DOI: 10.1161/circ.148.suppl_1.18776 ISSN: 0009-7322

Abstract 18776: ECG-GPT: Automated Complete Diagnosis Generation From ECG Images Using Novel Vision-Text Transformer Model

Akshay Khunte, Veer Sangha, Gregory Holste, Lovedeep S Dhingra, Arya Aminorroaya, Zhangyang Wang, Rohan Khera
  • Physiology (medical)
  • Cardiology and Cardiovascular Medicine

Background: Artificial intelligence can automate the diagnosis of many rhythm and conduction disorders from ECGs, but it is limited by its dependence on discretely defined labels. Moreover, most models rely on signal data, which limits their use in low-resource settings, where such models are needed most and ECGs are available in image form.

Methods: A total of 1.4 million 12-lead ECGs and the accompanying cardiologist diagnosis statements from Yale were used to develop a model capable of generating ECG report statements covering a wide array of conditions. To enable report generation across ECG layouts, 12-lead signal recordings were plotted in multiple formats chosen at random. The gold-standard diagnosis statements were processed to remove identifiable information and standardize diagnosis reporting. An ECG-specific vision encoder-decoder model was developed, pairing a pretrained BEiT transformer encoder (384x384 pixel input) with a generative pretrained transformer, GPT-2, as the decoder (B).

Results: In a held-out test set of 100,000 ECGs, representing ECG interpretations of up to 255 tokens and over 41 distinct cardiovascular conditions, the model performed well across conventional metrics, with ROUGE-L and BLEU-1 scores of 0.695 and 0.502, respectively (C). The model’s clinical validity was assessed by computing agreement between cardiologist and model-generated diagnosis statements. The model generated text reports that accurately identified a diverse set of rhythm and conduction disorders, with AUROCs for predicting atrial fibrillation, right bundle branch block, and sinus tachycardia of 0.95, 0.94, and 0.96, respectively (D).
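
For readers unfamiliar with the text-overlap metrics reported above, the following is a minimal sketch of how ROUGE-L and BLEU-1 can be computed for a single reference/generated report pair. It is not the authors' evaluation code. It assumes lowercase whitespace tokenization, the F1 form of ROUGE-L, and one reference per report; the abstract does not specify these choices. The function names and the example statements are illustrative only and are not drawn from the study data.

```python
import math
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed variant)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def bleu_1(reference, candidate):
    """BLEU-1: clipped unigram precision times the brevity penalty."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Each candidate token is credited at most as often as it occurs in the reference.
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand)
    # Candidates no longer than the reference are penalized; equal length gives 1.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision


# Hypothetical statements, for illustration only.
reference = "sinus tachycardia with right bundle branch block"
generated = "sinus tachycardia right bundle branch block"
print(round(rouge_l_f1(reference, generated), 3))
print(round(bleu_1(reference, generated), 3))
```

Corpus-level scores such as those reported above are typically aggregated over all test reports; the aggregation method is likewise not stated in the abstract.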

Conclusions: Our vision encoder-decoder model generates an ECG report from ECG images in any layout. This approach represents an automated and accessible strategy for generating cardiologist-level reporting on ECGs, useful for low-resource settings.
