DOI: 10.1161/circ.148.suppl_1.17074 ISSN: 0009-7322

Abstract 17074: Development and Multinational Validation of a Machine Learning-Based Optimization for Efficient Screening for Elevated Lipoprotein(a)

Arya Aminorroaya, Lovedeep S Dhingra, Seyedmohammad Saadatagah, Erica S Spatz, Evangelos K Oikonomou, Rohan Khera
  • Physiology (medical)
  • Cardiology and Cardiovascular Medicine

Background: Lipoprotein(a) [Lp(a)] is associated with an increased risk of major adverse cardiovascular (CV) events, even among those receiving optimal lipid-lowering therapy. However, <1% of patients, with or without atherosclerotic cardiovascular disease (ASCVD), undergo Lp(a) testing. We developed a machine learning model to optimize screening for high Lp(a) using structured clinical elements.

Methods: We included 456,815 participants in the UK Biobank (UKB), the largest cohort of people who underwent protocolized Lp(a) testing regardless of indication. Candidate features for predicting high Lp(a) (≥150 nmol/L) spanned demographics, medication history, diagnoses, procedures, vital signs, and lab values from UKB and linked electronic health records (EHR). The dataset was split into train and test sets (80:20). We trained multiple extreme gradient boosting models using all available features and used cross-validation to select the best model, prioritizing high AUROC while minimizing the number of features. The pooled cohort equations (PCE) score was used as a comparator for predicting high Lp(a). We externally validated the model in the well-established CV cohorts ARIC, CARDIA, and MESA (Fig A).
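
AUROC, the selection criterion named above, can be read as the probability that a randomly chosen participant with high Lp(a) receives a higher predicted score than a randomly chosen participant without it. The following is a minimal sketch of that pairwise definition, not the authors' code; the function name `auroc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for high Lp(a), 0 otherwise. Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not drawn from UKB or any validation cohort.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

Under this definition, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so it suits illustration rather than a cohort of this scale.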

Results: The best model used six features (LDL-C, statin use, triglycerides, HDL-C, ASCVD, and antihypertensive medication use) and outperformed the PCE score for predicting high Lp(a) (AUROC 0.66 vs 0.52, P<0.001). The model performed comparably across the ARIC, CARDIA, and MESA cohorts (Fig B). The model reduced the number needed to test (NNT) to find 1 person with high Lp(a) by up to 62%, depending on the probability threshold, which can be tailored to different use cases (Fig C). The relative reduction in NNT was greater among those without than among those with ASCVD (49% vs 26%, for optimized specificity with a threshold of 0.2).
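
The NNT comparison can be illustrated with a short worked example. The sketch below assumes that the untargeted NNT is the reciprocal of the prevalence of high Lp(a), and that the model-guided NNT is the reciprocal of the positive predictive value among people whose predicted probability meets the threshold. The abstract does not spell out its exact computation, so these definitions, the function names, and the toy data are all assumptions.

```python
def nnt_untargeted(labels):
    """Tests needed per case found when everyone is tested (1 / prevalence)."""
    cases = sum(labels)
    if cases == 0:
        raise ValueError("no cases in the sample")
    return len(labels) / cases


def nnt_targeted(labels, scores, threshold):
    """Tests needed per case found when only people at or above threshold are tested (1 / PPV)."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    cases = sum(flagged)
    if cases == 0:
        raise ValueError("no cases flagged at this threshold")
    return len(flagged) / cases


def relative_nnt_reduction(labels, scores, threshold):
    """Relative reduction in NNT from model-guided versus untargeted testing."""
    return 1.0 - nnt_targeted(labels, scores, threshold) / nnt_untargeted(labels)


# Hypothetical toy data: 10 people, 2 with high Lp(a) (label 1).
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.35, 0.30, 0.10, 0.25, 0.15, 0.05, 0.22, 0.08, 0.12, 0.18]

baseline = nnt_untargeted(toy_labels)
targeted = nnt_targeted(toy_labels, toy_scores, threshold=0.2)
reduction = relative_nnt_reduction(toy_labels, toy_scores, threshold=0.2)
```

In this toy data, untargeted testing needs 5 tests per case and model-guided testing at a threshold of 0.2 needs 2, a 60% relative reduction. Raising the threshold generally lowers the NNT but leaves more people with high Lp(a) untested, which is why the threshold is tuned to the use case.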

Conclusions: Our model optimizes Lp(a) testing strategies using commonly available clinical features and can be deployed in EHR and other settings for efficient Lp(a) screening. The model can encourage greater Lp(a) testing and optimize case finding for future studies.
