Machine learning based differentiation of RVOT and LVOT ventricular arrhythmias using electrocardiographic features
M Botis, D Tsiachris, G Botis, A Koumpanakis, L I Bartsioka, A Kordalis, I Doundoulakis, C K Antoniou, A E Karanikola, G K Matsopoulos, K Tsioufis
Abstract
Purpose
To assess whether a machine learning model based on ECG-derived features can reliably differentiate RVOT from LVOT PVCs.
Materials and methods
The study included 47 patients with documented premature ventricular contractions (PVCs), comprising 23 (48.9%) of LVOT origin and 24 (51.1%) of RVOT origin, who underwent successful catheter ablation. The exact site of origin was determined by 3-D electroanatomical mapping. The study workflow is summarized in Figure 1. Electrocardiogram (ECG) features were manually obtained by a certified cardiologist. A total of 67 features were obtained, including R- and S-wave amplitudes and widths across leads V1–V6, I–III, aVR, and aVL, as well as QRS axis and QRS width, derived from both PVC and sinus rhythm beats. Feature selection was performed using the ANOVA F-test. The dataset was divided into training (70%) and test (30%) sets using stratified sampling to preserve class balance. A logistic regression model was trained on the training set using five-fold cross-validation to optimize the regularization strength, penalty type, and solver. After hyperparameter selection, the final model was retrained on the entire training set and evaluated on the independent test set. The performance evaluation metrics were accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC).
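The workflow described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic placeholders with the study's dimensions (47 patients, 67 features), and the feature standardization, the candidate grid values, the random seeds, and the use of ROC-AUC as the tuning score are all assumptions not given in the abstract. The sketch also places feature selection inside the pipeline so that it is refit within each training fold, which may differ from the order used in the study.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the ECG feature table (NOT study data):
# 23 samples of class 0 and 24 of class 1, 67 features each.
rng = np.random.default_rng(0)
y = np.array([0] * 23 + [1] * 24)
X = rng.normal(size=(47, 67))
X[y == 1, :8] += 1.5  # make the first 8 placeholder features informative

# Stratified 70/30 split to preserve class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),                         # assumption: not stated in the abstract
    ("select", SelectKBest(score_func=f_classif, k=8)),  # ANOVA F-test, keep 8 features
    ("clf", LogisticRegression(max_iter=5000)),
])

# Hypothetical search grid over regularization strength, penalty, and solver.
param_grid = {
    "clf__C": [0.01, 0.1, 1.0, 10.0],
    "clf__penalty": ["l1", "l2"],
    "clf__solver": ["liblinear", "saga"],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="roc_auc")
search.fit(X_train, y_train)  # refit=True retrains the best model on the full training set

best_model = search.best_estimator_
proba = best_model.predict_proba(X_test)[:, 1]
pred = best_model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "f1": f1_score(y_test, pred),
    "roc_auc": roc_auc_score(y_test, proba),
}
selected = best_model.named_steps["select"].get_support(indices=True)
print(search.best_params_, metrics, selected)
```

Wrapping scaling and selection in the pipeline keeps the test set untouched until the final evaluation; with a cohort of this size, the resulting metrics will vary noticeably with the random split.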
Results
Feature selection identified eight discriminative electrocardiographic features: V1 S-wave amplitude, V2 S-wave amplitude, V3 R-wave width, V3 S-wave amplitude, V3 S-wave width, V4 R-wave amplitude, V5 R-wave amplitude, and PVC QRS axis (inferior–superior). The optimal hyperparameters were a regularization strength of 0.1, an L2 penalty and the 'saga' solver. The corresponding model coefficients were: V3 S-wave amplitude (−0.283), PVC QRS axis (−0.282), V1 S-wave amplitude (−0.271), V2 S-wave amplitude (−0.265), V5 R-wave amplitude (+0.188), V4 R-wave amplitude (+0.186), V3 S-wave width (+0.170), and V3 R-wave width (+0.090). On the held-out test set, the model reached an accuracy of 0.930 [0.786, 1.000], precision of 0.920, recall of 0.940, F1-score of 0.930, and ROC-AUC of 0.957 [0.813, 1.000] (Figure 2).
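For orientation, the reported coefficients can be written in the standard logistic regression form. The intercept $\beta_0$, the feature scaling, and the class coding (which origin is treated as the positive class) are not given in the abstract, so the expression below is only a schematic under those assumptions, with $x$ denoting each feature as it was fed to the model:

```latex
\hat{p} = \frac{1}{1 + e^{-z}}, \qquad
\begin{aligned}
z = \beta_0
  &- 0.283\, x_{\text{V3 S amp}}
   - 0.282\, x_{\text{PVC axis}}
   - 0.271\, x_{\text{V1 S amp}}
   - 0.265\, x_{\text{V2 S amp}} \\
  &+ 0.188\, x_{\text{V5 R amp}}
   + 0.186\, x_{\text{V4 R amp}}
   + 0.170\, x_{\text{V3 S width}}
   + 0.090\, x_{\text{V3 R width}}
\end{aligned}
```

Here $\hat{p}$ is the predicted probability of the positive class. Features with negative coefficients lower that probability as they increase, and those with positive coefficients raise it; mapping either direction to RVOT or LVOT requires the class coding used in the study.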
Conclusion
A logistic regression model using a small set of carefully selected ECG features can accurately differentiate PVCs originating from the right versus the left ventricular outflow tract. The model achieved high accuracy, F1-score, and ROC-AUC on the held-out test set. These findings highlight the potential of machine learning models to support non-invasive arrhythmia localization in clinical practice, informing pre-ablation decision-making.
Figure 1: Study workflow
Figure 2