DOI: 10.1093/europace/euag105.1226 ISSN: 1099-5129

Making wrist photoplethysmography reliable for clinical activity quantification: validation of a novel machine learning artefact removal procedure

P Vermunicht, C Buyck, S Naessens, W Hens, E Van Craenenbroeck, J S Piedrahita Giraldo, K Makayed, S Herman, K Laukens, J Roeykens, K De Deckere, L Desteghe, H Heidbuchel

Abstract

Background/Introduction

Accurate heart rate monitoring supports objective quantification of physical activity intensity, a key determinant in cardiovascular prevention and rehabilitation. Wrist photoplethysmography (PPG) enables accessible, near-continuous monitoring, but motion artefacts compromise clinical interpretability. We optimised and prospectively validated the first machine learning artefact removal procedure (ARP) that operates directly on heart rate values exported from consumer wearables, aiming to better preserve activity information in real-world conditions.

Purpose

To validate ARP performance in daily life and exercise conditions, and to test clinical utility through agreement with the Antwerp Activity Index (AAI), an individualised, heart-rate-based physical activity score.

Methods

In this prospective study, 149 participants (46 following cardiac rehabilitation, 103 healthy individuals) wore a PPG-based wrist device and a reference chest strap for 12 weeks. Prior testing defined participants as PPG-compatible (mean absolute error <10% during ≥70% of training data). Three quarters of the PPG-compatible participants were used to train machine learning models for artefact and activity detection; the remaining quarter served as the test population. To balance artefact removal and activity preservation, multiple probability thresholds for each model were combined to reject unreliable PPG datapoints. The AAI was calculated from the resulting PPG and reference data to identify the optimal threshold combination.
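The threshold combination step can be illustrated with a minimal Python sketch. It assumes each PPG datapoint carries an artefact probability and an activity probability, and that a datapoint is rejected when the artefact probability exceeds one threshold while the activity probability falls below another. The function names, the threshold grid and the example values are hypothetical and are not taken from the study.

```python
from itertools import product

# Hypothetical threshold grid; the abstract does not list the values that were explored.
ARTEFACT_THRESHOLDS = (0.3, 0.5, 0.7)
ACTIVITY_THRESHOLDS = (0.3, 0.5, 0.7)


def is_unreliable(artefact_prob, activity_prob, artefact_thr, activity_thr):
    """Flag a datapoint when an artefact is likely AND genuine activity is unlikely."""
    return artefact_prob > artefact_thr and activity_prob < activity_thr


def filter_datapoints(heart_rates, artefact_probs, activity_probs,
                      artefact_thr, activity_thr):
    """Return the heart rate values that are not flagged as unreliable."""
    return [
        hr
        for hr, p_art, p_act in zip(heart_rates, artefact_probs, activity_probs)
        if not is_unreliable(p_art, p_act, artefact_thr, activity_thr)
    ]


def threshold_combinations():
    """Enumerate every (artefact threshold, activity threshold) pair in the grid."""
    return list(product(ARTEFACT_THRESHOLDS, ACTIVITY_THRESHOLDS))


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    heart_rates = [72, 180, 135, 64]
    artefact_probs = [0.1, 0.9, 0.8, 0.6]
    activity_probs = [0.2, 0.1, 0.9, 0.3]
    for art_thr, act_thr in threshold_combinations():
        kept = filter_datapoints(heart_rates, artefact_probs, activity_probs,
                                 art_thr, act_thr)
        print(art_thr, act_thr, kept)
```

In such a setup, each candidate combination would then be scored by how well the AAI computed from the retained PPG datapoints agrees with the reference AAI; the AAI calculation itself is not described in this abstract and is therefore left out of the sketch.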

Results

Seventy-eight participants (53.8%) were PPG-compatible, with 75% (n=58; 4,144,654 datapoints) used for training and 25% (n=20; 992,180 datapoints) for testing. The artefact model detected artefacts with a sensitivity of 58.5% in daily life and 76.6% during exercise, while limiting incorrect removals (74.9-91.3% specificity). The activity model reached 74.4% sensitivity and 84.4% specificity in daily life, and 88.4% sensitivity with 40.1% specificity during exercise. The optimal combination rule, which marked a datapoint as unreliable if artefact probability was >50% and activity probability was <70%, produced the best AAI agreement. Pearson correlation between PPG-derived and reference AAI was r=0.96 in daily life and r=0.89 during exercise, with mean absolute errors of 5.2 and 6.7 AAI points, respectively (Figure 1). During exercise, large AAI discrepancies (>20 points) were infrequent (5.7%), and clinically relevant complete activity misses or false detections were even rarer (1.4%).
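The agreement measures reported above (Pearson correlation, mean absolute error, and the proportion of discrepancies beyond a fixed limit) can be computed as in the following Python sketch. It assumes paired lists of PPG-derived and reference AAI values; the function names and the example values are hypothetical, and the study's own analysis pipeline may differ.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def mean_absolute_error(x, y):
    """Average absolute difference between paired values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def proportion_large_discrepancies(x, y, limit=20.0):
    """Fraction of pairs whose absolute difference exceeds `limit` points."""
    return sum(abs(a - b) > limit for a, b in zip(x, y)) / len(x)


if __name__ == "__main__":
    # Made-up AAI values, for illustration only.
    ppg_aai = [10.0, 35.0, 60.0, 80.0]
    reference_aai = [12.0, 30.0, 65.0, 55.0]
    print(pearson_r(ppg_aai, reference_aai))
    print(mean_absolute_error(ppg_aai, reference_aai))
    print(proportion_large_discrepancies(ppg_aai, reference_aai))
```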

Conclusion(s)

The machine learning ARP improves the reliability of 24-hour wrist PPG by removing artefacts while preserving activity-relevant signals. The approach delivers strong agreement with a clinically interpretable activity score and is suitable for integration into remote monitoring workflows. Future trials will prospectively evaluate its potential to improve physical activity feedback, support adherence, and contribute to improved clinical outcomes.

Figure 1
