DOI: 10.1093/ejhf/xuag193.290 ISSN: 1388-9842

Mortality accuracy of artificial intelligence and machine learning models in patients with heart failure: a systematic review

C Vigil Lopez, M J Rivera, V Agundez-Garcia

Abstract

Introduction

Heart failure (HF) is a global burden affecting millions of people worldwide each year. Its diagnosis and treatment are complex, and estimating prognosis and survival is more complex still. Machine learning (ML) and artificial intelligence (AI) are novel tools that are impacting the medical field, proving their usefulness in detecting, treating and predicting pathologies, to the benefit of patients and physicians alike. Many scores currently exist for predicting mortality in adult patients with HF, but manual interpretation of these scores leads to inconsistencies between evaluators. ML and AI tools for predicting mortality are being developed and extensively evaluated to address this issue and provide more accurate survival estimates.

Objectives

To evaluate the accuracy of AI and ML models intended to assist physicians in everyday practice in predicting mortality in patients with HF, using data from all validation and test cohorts available in the literature.

Methods

A systematic search of PubMed, Google Scholar, Elsevier, Wiley and Nature was conducted for studies published from January 2019 to October 2025 that evaluated the application of ML and AI models to assess mortality in patients with HF. Only data from validation and test cohorts were considered. Sensitivity and specificity were analysed for the different types of models used. Only articles in the English language were considered.
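
For readers less familiar with the metrics named above, the following is a minimal sketch of how accuracy, sensitivity and specificity are conventionally derived from a model's confusion matrix. It is not the authors' analysis code, which the abstract does not describe; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the included studies.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: deaths correctly predicted      fn: deaths missed by the model
    tn: survivors correctly predicted   fp: survivors wrongly flagged as deaths
    """
    positives = tp + fn  # patients who actually died
    negatives = tn + fp  # patients who actually survived
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    accuracy = (tp + tn) / (positives + negatives)
    sensitivity = tp / positives   # true positive rate
    specificity = tn / negatives   # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not data from this review).
acc, sens, spec = confusion_metrics(tp=40, fp=30, tn=120, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In a mortality setting, sensitivity is the share of patients who died that the model flagged, and specificity is the share of survivors it correctly classified as survivors.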

Results

A total of 12 studies comprising 66,216 patients and 80 AI and ML prognostic models were included in this analysis. The pooled patient cohort was predominantly male (57.6%), with a mean age of 72.8 ± 8.0 years. Across all ML and AI models, overall accuracy was 0.782, sensitivity was 0.560 and specificity was 0.701. Random Forest and Gradient Boosting Tree were two of the most frequently used algorithms across the included studies and performed better than the other algorithms analysed.

Conclusions

AI and ML-based prognostic models demonstrate moderate overall performance for mortality prediction in patients with HF, with balanced sensitivity and specificity across validation and test cohorts. However, the moderate sensitivity observed highlights the need for further study of these models and external validation before routine clinical implementation. Future studies should focus on improving model generalizability, interpretability, and integration into real-world clinical workflows.

IMAGE 1. SROC curve ML and AI algorithms

IMAGE 2. PRISMA ML and AI algorithms
