Discovery-Driven Plasma Proteomics Identifies a Multi-Protein Signature for Amyloid PET Positivity: A Machine Learning Analysis of the Bio-Hermes Cohort
Stelios Lamprou, Kalliopi Mavromati, Frank J. Gunn-Moore, Terry J. Quinn

Alzheimer’s disease is a progressive neurodegenerative disorder in which early detection remains limited by the cost and invasiveness of positron emission tomography and cerebrospinal fluid testing. We evaluated whether plasma proteomic profiles could distinguish amyloid PET-positive from amyloid PET-negative individuals using the Bio-Hermes cohort. After quality control and missing-data filtering, 988 participants and 295 proteins were analysed; 31 proteins showing group differences were used for supervised classification. Random Forest, Gradient Boosting, and Neural Network models were trained across four train/test splits with repeated cross-validation and class downsampling. Amyloid-positive and amyloid-negative groups differed across a subset of proteins, with five remaining significant after false discovery rate correction. Tree-based models performed most consistently, with Random Forest and Gradient Boosting achieving AUC values of 0.79–0.81 and balanced accuracy of 0.68–0.73. Eight proteins (SERPINA1, C3, CRP, APOE4, CFH, VTN, C1QTNF5, and PON1) emerged as recurring high-importance features. These findings indicate that discovery-driven plasma proteomics can identify multi-protein signatures associated with amyloid status and can complement established single-analyte blood biomarkers by adding pathway-level information.
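
The false discovery rate correction mentioned above can be illustrated with a minimal sketch. The abstract does not state which procedure was used; the Benjamini-Hochberg step-up method is assumed here because it is the most common choice, and the function names `benjamini_hochberg` and `significant`, the threshold `alpha=0.05`, and the example p-values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used only for illustration; the source does not
    name the FDR procedure that was applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the rank decreases (monotonicity).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(p_values, alpha=0.05):
    """Flag each test whose adjusted p-value is at or below alpha."""
    return [q <= alpha for q in benjamini_hochberg(p_values)]


if __name__ == "__main__":
    # Hypothetical per-protein p-values, not data from the cohort.
    example = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example))
    print(significant(example))
```

Under this assumption, a protein would count as significant after correction when its adjusted p-value falls at or below the chosen threshold, which is how a larger set of nominally different proteins can shrink to a handful.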