Unsupervised machine learning stratifies AF patients into phenotypes linked to post-PVI recurrence
S Ganesh, I Vatsaraj, L Steffens, H Horlitz, F Stoeckigt, N Trayanova, Y Mohsen
Abstract
Background
Recurrence after pulmonary vein isolation (PVI) remains common, and it is unclear how comorbidities and demographics interact to shape recurrence risk. We asked whether patient embeddings derived from a machine learning model could uncover latent phenotypes linked to recurrence.
Objective
To investigate how clinical and demographic patient data relate to atrial fibrillation (AF) recurrence (R) outcomes, using unsupervised machine learning, dimensionality reduction, clustering, and feature importance analysis.
Methods
We analyzed 621 patients with 96 variables (demographics, comorbidities, medications, outcomes). We trained an XGBoost classifier with recurrence excluded from the input features, computed per-patient SHAP values, and clustered patients on their SHAP profiles using k-means. The number of clusters was selected by combining the silhouette score, the normalized Calinski-Harabasz index, and 1 − the normalized Davies-Bouldin index, and was checked qualitatively with UMAP. An auxiliary XGBoost model (trained without recurrence labels) predicted cluster membership, and cluster-specific SHAP beeswarm plots summarized feature contributions. Mean recurrence was then computed per cluster.
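The cluster-count selection described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes min-max normalization across the candidate values of k and an unweighted average of the three terms, neither of which is specified in the abstract. The function names and the metric values are hypothetical, chosen only to show the mechanics.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(silhouette, calinski_harabasz, davies_bouldin):
    """Combine three cluster-validity metrics into one score per k.

    Each argument maps a candidate cluster count k to a metric value.
    Silhouette is used as is, Calinski-Harabasz is min-max normalized
    (higher is better), and Davies-Bouldin is min-max normalized and
    inverted (lower is better). Equal weighting is an assumption.
    """
    ks = sorted(silhouette)
    sil = [silhouette[k] for k in ks]
    ch = min_max([calinski_harabasz[k] for k in ks])
    db = min_max([davies_bouldin[k] for k in ks])
    return {
        k: (s + c + (1.0 - d)) / 3.0
        for k, s, c, d in zip(ks, sil, ch, db)
    }


def best_k(scores):
    """Return the cluster count with the highest composite score."""
    return max(scores, key=scores.get)


# Hypothetical metric values for k = 2..5 (made up, not study data).
silhouette = {2: 0.30, 3: 0.35, 4: 0.42, 5: 0.38}
calinski_harabasz = {2: 150.0, 3: 210.0, 4: 260.0, 5: 240.0}
davies_bouldin = {2: 1.40, 3: 1.10, 4: 0.85, 5: 0.95}

scores = composite_scores(silhouette, calinski_harabasz, davies_bouldin)
chosen_k = best_k(scores)
```

In practice the three inputs would come from a clustering library evaluated on the SHAP matrix for each candidate k, and the chosen k would still be inspected visually, as the abstract does with UMAP.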
Results
The cohort had a mean age of 66.9±11.0 years and was 59.6% male. Overall, 267/621 (43%) patients had AF recurrence. Four SHAP-defined clusters emerged.
Clusters: C1: n=328; recurrence 4%. C2: n=69; recurrence 84%. C3: n=93; recurrence 86%. C4: n=131; recurrence 89%.
C1 — lowest risk (n = 328, R: 4%):
Inclusion was favored by younger age (↓ age) and higher symptom burden (↑ EHRA class). Additional SHAP signals included lower CRP, lower fluoroscopy dose, shorter EP-lab duration, higher hemoglobin, and preserved renal function (↑ GFR), with TSH and BMI contributory but secondary.
C2 — high-risk, metabolic profile (n = 69, R: 84%):
Inclusion was driven primarily by ↑ CRP and ↑ LDH, with contributions from thyroid function (↓ TSH), higher body mass/weight, electrolytes (↓ Na), hemoglobin, GFR, platelets (Thromb), and EHRA class, a pattern consistent with systemic inflammation and metabolic stress.
C3 — high-risk, ischemic/renal/procedural load (n = 93, R: 86%):
Longer EP-lab duration (↑ EPU_duration) strongly favored membership, together with prior myocardial infarction (MI present), higher adiposity (↑ weight/BMI), renal impairment (↑ creatinine), greater radiation exposure (↑ fluoroscopy dose/time), non-paroxysmal AF type, and ↑ CRP.
C4 — high-risk, mixed profile (n = 131, R: 89%):
Top contributors included TSH and CRP (both tending higher among included patients), with additional signals from EHRA, reduced EF (↓ EF), ↑ weight, ↑ LDH, hemoglobin, and MCV.
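As a quick arithmetic cross-check, the per-cluster sizes and recurrence rates reported above can be compared with the overall figure of 267/621. The sketch below uses only numbers stated in this abstract. The ±0.5 percentage point tolerance is an assumption that the per-cluster rates were rounded to whole percentages.

```python
# Cluster size and reported recurrence rate, taken from the Results.
clusters = {
    "C1": (328, 0.04),
    "C2": (69, 0.84),
    "C3": (93, 0.86),
    "C4": (131, 0.89),
}

reported_total = 621
reported_recurrences = 267

# Cluster sizes should add up to the full cohort.
total_n = sum(n for n, _ in clusters.values())

# Recurrences implied by the rounded per-cluster rates.
implied = sum(n * rate for n, rate in clusters.values())

# Rounding each rate to a whole percent leaves up to 0.5 percentage
# points of slack per cluster, so compute the range it allows.
slack = sum(n * 0.005 for n, _ in clusters.values())
lower, upper = implied - slack, implied + slack

overall_rate = implied / total_n
consistent = (total_n == reported_total) and (lower <= reported_recurrences <= upper)
```

With these inputs the cluster sizes sum to the full cohort, and the reported 267 recurrences fall inside the range allowed by rounding.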
Conclusion
Unsupervised, SHAP-guided phenotyping uncovered latent, clinically interpretable AF profiles and separated one low-recurrence group (4%) from three high-recurrence groups (84% to 89%), patterns not apparent from raw features or standard summaries. These findings illustrate how transparent unsupervised ML can surface multivariate, non-linear interactions to support risk stratification, generate testable hypotheses, and inform personalized peri-procedural management.