DOI: 10.1093/europace/euag105.1232 ISSN: 1099-5129

Artificial intelligence-based ascertainment of sudden cardiac death in electronic health records: development, validation and utility across national datasets

B Petrazzini, N Ahmed, Z Fan, J Lian, R Do, S Rao, K Rahimi

Abstract

Background

Sudden cardiac death (SCD) accounts for approximately 15 to 20% of global deaths [1]. Yet it is poorly captured in national death registries, because ascertaining SCD requires the timeframe and origin of death, which are often missing. This limits both the statistical power of SCD research and the evidence available to inform health policy. Electronic health records (EHR) could enable large-scale SCD research, but the latest EHR phenotype of SCD has modest precision (positive predictive value [PPV]=55%) [2]. Transformer-based artificial intelligence (AI) models could capture the timeframe and origin of death and thereby improve ascertainment of SCD in EHR.

Purpose

To develop and validate an AI-based model for ascertainment of SCD (aSCD) in EHR and to demonstrate its utility across national datasets.

Methods

We trained an AI model on the diagnostic and procedural codes of >7 million individuals in the Clinical Practice Research Datalink (CPRD) and fine-tuned it on 1,084 definite cases of SCD and 6,142 curated controls (Figure 1). We evaluated the area under the curve (AUC) of aSCD in a test set designed to minimise label leakage, and tested whether aSCD captures the suddenness and cardiac origin of death in CPRD. We then applied aSCD to retrospectively ascertain SCD at a rate matching the expected UK incidence. Finally, we assessed whether aSCD can inform nationwide redistribution of automated external defibrillators (AED), pharmacovigilance and biomarker discovery in CPRD, UK Biobank (UKB) and Mount Sinai Data Warehouse (MSDW).
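
As a minimal sketch of the discrimination metric named above, and assuming AUC here means the area under the receiver operating characteristic curve, it can be computed as the probability that a randomly chosen case scores higher than a randomly chosen control. The function name and the scores below are hypothetical illustrations, not the study's code or data.

```python
def auc_mann_whitney(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case
    has the higher score; tied pairs count as half a win."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model scores (NOT data from the study)
cases = [0.91, 0.85, 0.30, 0.77]
controls = [0.10, 0.35, 0.52, 0.05, 0.20]

auc = auc_mann_whitney(cases, controls)
print(f"AUC = {auc:.2f}")
```

The pairwise form is quadratic in sample size and is meant only to make the definition concrete; a rank-based implementation would be preferable on a test set of realistic size.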

Results

In the test set, aSCD showed an AUC of 0.95 (95% confidence interval [CI] 0.90-0.97) and a PPV of 0.86 (95% CI 0.83-0.89). The tool tracked signatures of suddenness of death, as shown by significantly higher scores in sudden versus expected types of death (P<0.0001) (Figure 2A). aSCD also captured the cardiac origin of death, as shown by an explainability analysis dominated by cardiovascular codes, strong discrimination of cardiac versus non-cardiac causes of cardiac arrest (AUC=0.95 [95% CI 0.92-0.98]), and enrichment of SCD-related causes of death (β=0.49, P<0.0001) (Figure 2B-D). Following guidance from the ESCAPE-NET Consortium, aSCD ascertained 138,387 cases of SCD in the UK. We then evaluated three use cases for aSCD. First, analyses of regions in England revealed a mismatch between the incidence of ascertained SCD and AED density, highlighting opportunities for redistribution of AED. Second, in a pharmacovigilance study, aSCD identified 10 out of 13 (76.9%) QT-prolonging drugs, with hazard ratios ranging from 1.6 to 10.0. Finally, in a meta-analysis of UKB and MSDW, aSCD tracked with known markers of SCD risk such as lower left ventricular ejection fraction (β=-4.2, P<0.0001) and longer corrected QT interval (β=14.5, P<0.0001).
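
To make the precision figure concrete, the sketch below computes a PPV and a 95% interval from a confusion-matrix count. The abstract does not state how its confidence intervals were derived; the Wilson score interval used here is an assumption chosen for illustration, and the counts are hypothetical rather than taken from the study.

```python
import math


def ppv(true_positives, false_positives):
    """Positive predictive value: the share of predicted cases that are true cases."""
    return true_positives / (true_positives + false_positives)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts (NOT from the study): 86 confirmed among 100 predicted cases
tp, fp = 86, 14
estimate = ppv(tp, fp)
lower, upper = wilson_interval(tp, tp + fp)
print(f"PPV = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The interval narrows as the number of predicted cases grows, so the same point estimate of 0.86 can carry very different uncertainty depending on the size of the test set.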

Conclusion

An AI-based tool improves the precision of SCD phenotyping in EHR by 31 percentage points (PPV of 86% for aSCD versus 55% for ESCAPE-NET). This can enable scalable SCD research in EHR, with much-needed applications to SCD prevention and health policy.

Figure 1

Figure 2
