DOI: 10.1093/europace/euag105.292 ISSN: 1099-5129

Machine learning model for early risk stratification of atrial fibrillation patients post-COVID-19

O V Stasyshena, O S Sychov, B V Batsak, O V Sribna, T V Getman, T V Mikhalieva, A N Solovyan, Y U V Zinchenko

Abstract

Introduction

Patients with pre-existing atrial fibrillation (AF) may experience arrhythmia progression after COVID-19 infection. However, it remains unclear which patients are at the highest risk.

Purpose

To develop a machine learning model for early risk stratification of AF patients post-COVID-19, predicting progression of AF form or increased arrhythmia burden versus a stable course.

Methods

We retrospectively analyzed 80 patients with established AF (mean age 64 years; 50% male). Common comorbidities included hypertension (82%), diabetes (12%), and heart failure (6%). Demographic, clinical, echocardiographic, laboratory, and 24-hour Holter monitoring data were collected for each patient. Missing numerical values were imputed with the median, and missing categorical entries were filled with "NA." We trained a CatBoostClassifier, chosen for its robust handling of mixed data types and small samples, on these features using a 70/30 train/test split (without class stratification, owing to imbalanced outcomes). The model predicted one of three outcomes: AF form worsening (progression from paroxysmal to persistent; Group 2), worsening clinical burden (increased AF episode frequency; Group 3a), or a stable disease course (Group 3b).
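The preprocessing and split described above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names (`impute`, `split_70_30`), the use of `None` to mark a missing value, the column names in the usage note, and the random seed are all assumptions made for illustration. The medians here are computed over all supplied rows, since the abstract does not say whether they were derived from the training portion only. Fitting the classifier itself is omitted.

```python
import random
from statistics import median


def impute(rows, numeric_cols, categorical_cols):
    """Return copies of rows with missing values (None) filled in.

    Numeric columns receive the column median of the observed values;
    categorical columns receive the literal string "NA".
    """
    medians = {}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        medians[col] = median(observed)

    filled = []
    for r in rows:
        new = dict(r)  # copy, so the input rows are left untouched
        for col in numeric_cols:
            if new[col] is None:
                new[col] = medians[col]
        for col in categorical_cols:
            if new[col] is None:
                new[col] = "NA"
        filled.append(new)
    return filled


def split_70_30(rows, train_fraction=0.7, seed=0):
    """Shuffle and split rows without class stratification."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed value is arbitrary
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

With 80 patients, `split_70_30` yields 56 training rows and 24 test rows. A hypothetical row might look like `{"la_diameter_mm": None, "af_type": "paroxysmal"}`, in which case `impute(rows, ["la_diameter_mm"], ["af_type"])` would replace the missing diameter with the cohort median.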

Results

The model achieved an accuracy of approximately 79% (macro F1-score ≈ 0.79) on the test set. Classification performance was balanced across the three outcome classes. Top predictive features included left atrial size, N-terminal pro-B-type natriuretic peptide (NT-proBNP), C-reactive protein (CRP), CHA2DS2-VA score, duration of AF prior to COVID-19 infection, heart rate variability indices, and other inflammatory markers (e.g., D-dimer). Patients predisposed to AF progression post-COVID-19 tended to have larger left atria and higher levels of cardiac and inflammatory biomarkers.
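For readers less familiar with the two reported metrics, the following sketch shows how accuracy and a macro-averaged F1-score are conventionally computed for a three-class problem. The function names and the toy labels in the usage note are illustrative assumptions, not study data. Macro averaging gives each class equal weight regardless of its size, which is why it is often reported alongside accuracy when outcomes are imbalanced.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

As a toy example, with true labels `["2", "2", "3a", "3a", "3b", "3b"]` and predictions `["2", "3a", "3a", "3a", "3b", "3b"]`, the per-class F1-scores are 2/3, 4/5, and 1, so the macro F1 is about 0.82 while accuracy is 5/6.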

Conclusion

A CatBoost-based machine learning model can effectively stratify post-COVID AF patients according to their risk of arrhythmia progression. This tool could inform early interventions for high-risk individuals and supports the "two-hit" hypothesis in AF, wherein an underlying arrhythmogenic substrate combined with the inflammatory stress of COVID-19 precipitates AF worsening.
