DOI: 10.3390/app16136296 ISSN: 2076-3417

Identifying Parkinson’s Disease from Gait Biomechanics Using a Participant-Level Machine Learning Analysis Pipeline

Li Jin

Parkinson’s disease (PD) is a progressive neurodegenerative disorder characterized by motor control, balance, and gait impairments that significantly elevate fall risk. Traditional gait analysis focuses on spatiotemporal parameters, while gait variability, asymmetry, and balance measures offer more sensitive indicators of PD-related motor deficits. Machine learning studies using wearable gait data frequently report high classification accuracy but lack biomechanical interpretability and methodological rigor. Using the PhysioNet Gait in Parkinson’s Disease database, 93 individuals with PD and 72 healthy controls were analyzed during level-ground walking. Key biomechanical differences were identified: stride time coefficient of variation was significantly higher in PD bilaterally (left p = 0.001; right p = 0.003); swing-phase time was significantly reduced in both limbs (left p = 0.003; right p = 0.001); anterior–posterior center of pressure (COP) variability was significantly lower in PD for both limbs (p < 0.001); and COP path symmetry index was the most prominent asymmetry marker, significantly elevated in PD relative to controls (p = 0.003). A machine-learning analysis pipeline identified HistGradientBoosting as the best-performing classifier (AUC = 0.992; accuracy = 97.6%), but leave-one-study-out evaluation exposed substantial cross-protocol heterogeneity (AUC: 0.500–1.000), indicating that the model relied partly on dataset-specific patterns and may not generalize to independent acquisition protocols. Shapley Additive Explanations (SHAP) analysis showed classification was driven by a multimodal combination of clinical severity measures and biomechanical gait features rather than wearable metrics alone. 
A pre-specified gait-only sensitivity analysis that excluded clinical severity variables (UPDRS, UPDRSM, Hoehn and Yahr) confirmed that biomechanical features alone retained moderate, but substantially reduced, discriminative ability (gait-only holdout AUC = 0.844), supporting the interpretation that the headline performance reflects multimodal clinical separation rather than a stand-alone wearable-gait biomarker. These findings indicate that Parkinsonian gait impairment is characterized by timing instability and constrained forward COP progression. The combination of biomechanical analysis with interpretable predictive modeling represents a structured analysis pipeline for gait-based PD assessment; however, external validation in independent cohorts and prospective testing across acquisition protocols are required before such a pipeline can be deployed as a clinically generalizable digital biomarker.
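
Two quantities named in the abstract, the stride time coefficient of variation and the leave-one-study-out evaluation, can be illustrated with a minimal sketch. The abstract does not give the exact formulas or the implementation, so the following are assumptions: the coefficient of variation is computed as the sample standard deviation divided by the mean, the function names are hypothetical, and the participant IDs, study labels, and stride times are made-up illustrative values, not data from the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def stride_time_cv(stride_times):
    """Coefficient of variation (%) of one participant's stride times, one limb.

    Assumption: sample standard deviation (n - 1) divided by the mean.
    """
    if len(stride_times) < 2:
        raise ValueError("need at least two strides")
    return 100.0 * stdev(stride_times) / mean(stride_times)


def leave_one_study_out(participants):
    """Yield (held_out_study, train_ids, test_ids), holding out each study in turn.

    `participants` maps a participant ID to a study label, so each
    participant falls entirely in the training set or entirely in the
    test set (a participant-level split).
    """
    by_study = defaultdict(list)
    for pid, study in participants.items():
        by_study[study].append(pid)
    for held_out in sorted(by_study):
        test_ids = sorted(by_study[held_out])
        train_ids = sorted(
            pid
            for study, ids in by_study.items()
            if study != held_out
            for pid in ids
        )
        yield held_out, train_ids, test_ids


# Hypothetical stride times (seconds) for one limb of one participant.
strides_left = [1.10, 1.12, 1.08, 1.15, 1.09]
cv = stride_time_cv(strides_left)

# Hypothetical participants and study labels.
participants = {"p01": "A", "p02": "A", "p03": "B", "p04": "C", "p05": "C"}
folds = list(leave_one_study_out(participants))
```

A model would be refit on `train_ids` and scored on `test_ids` for each fold. A wide spread in per-fold AUC, such as the 0.500–1.000 range reported above, is the kind of cross-protocol heterogeneity this scheme is meant to expose.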
