DOI: 10.1097/md.0000000000034878

A systematic comparison of machine learning algorithms to develop and validate prediction model to predict heart failure risk in middle-aged and elderly patients with periodontitis (NHANES 2009 to 2014)

Yicheng Wang, Yuan Xiao, Yan Zhang
  • General Medicine

Periodontitis is increasingly associated with heart failure. This study aimed to develop and validate a machine learning prediction model for the risk of heart failure in middle-aged and elderly participants with periodontitis. We analyzed data from 2876 participants with a history of periodontitis from the National Health and Nutrition Examination Survey (NHANES) 2009 to 2014: a training set of 1980 participants from NHANES 2009 to 2012 and an external validation set of 896 participants from NHANES 2013 to 2014. Independent risk factors for heart failure were identified using univariate and multivariate logistic regression analysis. Six machine learning algorithms (logistic regression, K-nearest neighbor, support vector machine, random forest, gradient boosting machine, and multilayer perceptron) were used to construct models on the training set. Model performance was evaluated using 10-fold cross-validation on the training set and receiver operating characteristic (ROC) curve analysis on the validation set. Univariate and multivariate logistic regression identified age, race, myocardial infarction, and diabetes mellitus status as independent predictors of heart failure risk in participants with periodontitis. The areas under the ROC curve (AUC) from 10-fold cross-validation for the 6 models, in the order listed above, were 0.848, 0.936, 0.859, 0.889, 0.927, and 0.666, respectively. The AUCs on the external validation set were 0.854, 0.949, 0.647, 0.933, 0.855, and 0.74, respectively. The K-nearest neighbor model achieved the best prediction performance of all the models.
Of the 6 machine learning models, the K-nearest neighbor model performed best. The prediction model can help identify the risk of heart failure in middle-aged and elderly patients with periodontitis and can support early, individualized diagnosis and treatment planning.
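The comparison workflow described in the abstract (six classifiers, 10-fold cross-validation on a training set, then ROC analysis on a separate validation set) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic and are not NHANES, the four features merely stand in for the four reported predictors, the 1980/896 split only mirrors the reported sample sizes, and scikit-learn with near-default hyperparameters is an assumption, since the abstract does not state the software or settings the authors used. The helper names `build_models`, `positive_scores`, and `compare_models` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT NHANES): 2876 rows, 4 features, imbalanced outcome.
X, y = make_classification(
    n_samples=2876, n_features=4, n_informative=4, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
# Split sizes mirror the abstract: 1980 for training, 896 for validation.
X_train, y_train = X[:1980], y[:1980]
X_valid, y_valid = X[1980:], y[1980:]


def build_models(seed=0):
    """Six classifiers with assumed, near-default settings (hypothetical helper)."""
    return {
        "logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)),
        "k-nearest neighbor": make_pipeline(
            StandardScaler(), KNeighborsClassifier()),
        "support vector machine": make_pipeline(StandardScaler(), SVC()),
        "random forest": RandomForestClassifier(random_state=seed),
        "gradient boosting machine": GradientBoostingClassifier(random_state=seed),
        "multilayer perceptron": make_pipeline(
            StandardScaler(), MLPClassifier(max_iter=500, random_state=seed)),
    }


def positive_scores(model, X):
    """Continuous scores for the positive class, as needed for ROC analysis."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(X)[:, 1]
    return model.decision_function(X)  # e.g. SVC without probability estimates


def compare_models(X_train, y_train, X_valid, y_valid, seed=0):
    """Return {model name: (mean 10-fold CV AUC, validation-set AUC)}."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    results = {}
    for name, model in build_models(seed).items():
        cv_auc = cross_val_score(
            model, X_train, y_train, cv=cv, scoring="roc_auc").mean()
        model.fit(X_train, y_train)  # refit on the full training set
        valid_auc = roc_auc_score(y_valid, positive_scores(model, X_valid))
        results[name] = (cv_auc, valid_auc)
    return results


results = compare_models(X_train, y_train, X_valid, y_valid)
for name, (cv_auc, valid_auc) in results.items():
    print(f"{name:26s} CV AUC={cv_auc:.3f}  validation AUC={valid_auc:.3f}")
```

The AUC values printed by this sketch come from synthetic data and have no relation to the figures reported in the study. Scaling is applied inside a pipeline for the distance-based and gradient-trained models so that it is refit within each cross-validation fold.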
