DOI: 10.1161/circ.148.suppl_1.13588 ISSN: 0009-7322

Abstract 13588: A Generalizable Deep Learning System for Cardiac MRI

Rohan Shad, Cyril R Zakka, Dhamanpreet Kaur, John Mongan, Kimberly G Kallianos, Ross Filice, Nishith Khandwala, David Eng, Curtis Langlotz, William Hiesinger
  • Physiology (medical)
  • Cardiology and Cardiovascular Medicine

Introduction: Our goal was to create a generalizable Cardiac MRI (CMR) deep learning system capable of understanding the breadth of human disease and health. With traditional supervised approaches, parameters learned for one clinical problem rarely generalize to others: models must be re-trained from scratch for each new clinical task of interest, requiring thousands of training examples every time. Unlike human clinicians, deep learning models lack a baseline fund of clinical knowledge upon which the learning of specific tasks can be accelerated.

Methods: We use self-supervised contrastive learning to train a video transformer system that learns complex pathophysiological visual representations, guided by natural language supervision from the text reports accompanying each CMR study. Two parallel deep learning systems, one processing text and the other processing CMR cine sequences, learn from each other, allowing the vision system to learn from radiologist reports as training progresses. Training was performed on a multicenter clinical dataset of 17,000 patients. We subsequently freeze the vision network and apply it to the task of calculating left ventricular ejection fraction (LVEF%) in the UK Biobank (45,632 participants).
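The paired text-and-video training described above is a CLIP-style symmetric contrastive objective: matching report/cine pairs in a batch are pulled together while all other pairings act as negatives. A minimal NumPy sketch of that loss follows; the embedding matrices, temperature value, and batch construction are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def contrastive_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired video/text embeddings.

    Row i of each matrix is a matching pair; every other row pairing in
    the batch serves as a negative. The temperature is a hypothetical
    default, not the value used by the authors.
    """
    # L2-normalize so the dot product is cosine similarity
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = v @ t.T / temperature           # (batch, batch) similarity matrix
    labels = np.arange(len(logits))          # true pairs sit on the diagonal

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the video-to-text and text-to-video directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this loss aligns the two encoders in a shared embedding space; the vision encoder can then be frozen and reused for downstream tasks such as LVEF estimation, as the abstract describes.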

Results: On a hold-out test subset of UK Biobank participants, we report a mean absolute error (MAE) of 3.344% (SD 3.615), with Bland-Altman limits of agreement of -9.91% to +9.61%. Using this network to screen for heart failure with reduced ejection fraction (HFrEF; LVEF < 40%), the AUC is 0.880 (95% CI: 0.835-0.925; n=4,259). This significantly outperforms our baseline, in which we fine-tune the same network architecture using traditional supervised approaches (AUC 0.751; 95% CI: 0.692-0.811; p < 0.001). Finally, our network achieves superior results with 1/100th of the data.
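The limits of agreement quoted above follow the standard Bland-Altman construction (mean paired difference ± 1.96 × SD of the differences). A minimal sketch of computing the MAE and those limits from predicted versus reference LVEF values; the arrays and helper name are hypothetical, not from the authors' code.

```python
import numpy as np

def agreement_stats(pred, ref):
    """Return (MAE, bias, limits of agreement) for paired measurements.

    Limits of agreement use the conventional bias +/- 1.96 * SD of the
    paired differences (sample SD, ddof=1).
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    mae = np.abs(pred - ref).mean()
    diff = pred - ref
    bias = diff.mean()
    spread = 1.96 * diff.std(ddof=1)
    return mae, bias, (bias - spread, bias + spread)

# Example with made-up LVEF% values (not study data):
mae, bias, loa = agreement_stats([55.0, 48.0, 62.0], [54.0, 50.0, 61.0])
```

Applied to the study's hold-out predictions, this construction would yield figures analogous to the reported MAE of 3.344% and limits of -9.91% to +9.61%.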

Conclusions: We show that contrastive text-video pre-training can be used to train powerful and generalizable deep learning systems for CMR.
