Prediction of Appropriate Implantable Cardioverter-Defibrillator Therapy Using Machine Learning and Routinely Available Clinical Data
Toshinori Chiba, Serafina Wegner, Emanuel Heil, Verena Tscholl, Robert Haettasch, Patrick Nagel, Johannes Lucas, Nikolaos Dagres, Felix Balzer, Alexander Meyer, Wilhelm Haverkamp, Florian Blaschke, Gerhard Hindricks, Felix Hohendanner
Abstract
Background and Aims
Risk stratification for appropriate implantable cardioverter-defibrillator (ICD) therapy remains imprecise when based on conventional clinical variables alone. We aimed to develop and geographically validate a machine-learning model that integrates routinely available clinical, electrocardiogram, and device interrogation/programming parameters, and to quantify the incremental value of non-sustained ventricular tachycardia (NSVT).
Methods
We retrospectively analysed 514 ICD recipients implanted between 2020 and 2025 at two hospital sites (development cohort) and an independent cohort of 220 patients from a third site (external validation). The endpoint was appropriate ICD therapy (anti-tachycardia pacing or shock). Models were trained using nested stratified cross-validation in the development cohort, and the final refitted model was applied to the external cohort without recalibration.
Results
In the development cohort, 77/514 patients (15%) experienced appropriate ICD therapy (follow-up 404 days). A base model (histogram-based gradient boosting) excluding NSVT achieved modest discrimination (ROC-AUC 0.624; average precision 0.198). Adding NSVT improved performance (ROC-AUC 0.805; average precision 0.432). Logistic regression with NSVT achieved comparable discrimination (ROC-AUC 0.78; average precision 0.43), indicating a largely NSVT-driven gain. External validation (event rate 51/220, 23%) confirmed good discrimination (ROC-AUC 0.815; average precision 0.583) with acceptable calibration (Brier score 0.137). At the prespecified threshold of 0.16 in external validation, sensitivity was 0.78 and specificity 0.74.
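For readers less familiar with the threshold-based and calibration metrics reported above, the following sketch shows how sensitivity, specificity, and the Brier score are computed from predicted probabilities. The toy labels and probabilities are invented for illustration and are not study data; only the threshold 0.16 and the external event rate 51/220 come from the text.

```python
def sensitivity_specificity(y_true, y_prob, threshold):
    """Classify as positive when probability >= threshold, then
    return (sensitivity, specificity)."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (invented values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.60, 0.30, 0.10, 0.40, 0.15, 0.12, 0.05, 0.02]

sens, spec = sensitivity_specificity(y_true, y_prob, threshold=0.16)
brier = brier_score(y_true, y_prob)

# Reference point for the Brier score: a model that assigns every patient
# the cohort event rate r scores r * (1 - r).
external_rate = 51 / 220
uninformative_brier = external_rate * (1 - external_rate)
```

With the external event rate of 51/220, the uninformative reference Brier score is about 0.178, which gives a scale against which the reported 0.137 can be read.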
Conclusion
A machine-learning model integrating routinely available clinical, electrocardiogram, and ICD programming/interrogation data may enable prediction of appropriate ICD therapy with preserved performance on geographic external validation. NSVT was the dominant contributor to predictive performance across modelling approaches.