DOI: 10.1002/sam.11654 ISSN: 1932-1864

Boosting diversity in regression ensembles

Mathias Bourel, Jairo Cugliari, Yannig Goude, Jean‐Michel Poggi
  • Computer Science Applications
  • Information Systems
  • Analysis

Abstract

Ensemble methods such as Bagging, Boosting, and Random Forests often improve on the predictive performance of single learners in both classification and regression tasks. For regression, we propose a gradient boosting‐based algorithm that incorporates a diversity term, with the aim of constructing different learners that enrich the ensemble while trading some individual optimality for global improvement. By verifying the hypotheses of Biau and Cadre's theorem (2021, Advances in contemporary statistics and econometrics: Festschrift in honour of Christine Thomas‐Agnan, Springer), we obtain a convergence result ensuring that the associated optimization strategy reaches the global optimum. In the experiments, we consider base learners of increasing complexity: stumps, regression trees, Purely Random Forests, and Breiman's Random Forests. Finally, numerical experiments on simulated and benchmark datasets and on a real‐world electricity demand dataset show the suitability of our procedure, examining the behavior not only of the final or aggregated predictor but also of the whole generated sequence.
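
To make the idea of a diversity term concrete, the following is a minimal sketch and not the authors' algorithm: the abstract does not state the objective, so the penalty used here is an assumption chosen for illustration. At each step the new stump h is fitted by minimizing 0.5*(r - h)^2 - 0.5*kappa*(h - h_prev)^2, where r is the current residual and h_prev the previous learner's prediction. For 0 <= kappa < 1 this is equivalent, up to a constant, to a least-squares fit to the target (r - kappa*h_prev)/(1 - kappa). The names `fit_stump`, `boost`, `kappa`, and `nu` are hypothetical.

```python
import math
import random


def fit_stump(x, z):
    """Least-squares stump on 1-D inputs: returns (threshold, left_mean, right_mean)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    zs = [z[i] for i in order]
    n, total = len(xs), sum(zs)
    # Fallback when no split is possible: a constant predictor.
    best = (float("inf"), xs[0], total / n, total / n)
    left_sum = 0.0
    for k in range(1, n):
        left_sum += zs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # cannot split between equal inputs
        right_sum = total - left_sum
        # Minimizing the squared error is the same as minimizing this score.
        score = -(left_sum ** 2 / k + right_sum ** 2 / (n - k))
        if score < best[0]:
            thr = 0.5 * (xs[k - 1] + xs[k])
            best = (score, thr, left_sum / k, right_sum / (n - k))
    return best[1:]


def predict_stump(stump, x):
    thr, left, right = stump
    return [left if xi <= thr else right for xi in x]


def mse(y, f):
    return sum((yi - fi) ** 2 for yi, fi in zip(y, f)) / len(y)


def boost(x, y, n_steps=50, nu=0.1, kappa=0.0):
    """L2 boosting with stumps; kappa > 0 adds the ASSUMED diversity penalty."""
    assert 0.0 <= kappa < 1.0
    n = len(y)
    f0 = sum(y) / n
    F = [f0] * n          # current ensemble prediction on the training set
    prev = [0.0] * n      # previous learner's prediction (none at the start)
    stumps, history = [], []
    for _ in range(n_steps):
        r = [yi - Fi for yi, Fi in zip(y, F)]
        # Diversity-adjusted target (illustrative assumption, see lead-in).
        z = [(ri - kappa * pi) / (1.0 - kappa) for ri, pi in zip(r, prev)]
        stump = fit_stump(x, z)
        h = predict_stump(stump, x)
        F = [Fi + nu * hi for Fi, hi in zip(F, h)]
        prev = h
        stumps.append(stump)
        history.append(mse(y, F))  # training error of the whole sequence
    return f0, stumps, history


if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(0.0, 6.0) for _ in range(200)]
    y = [math.sin(xi) + random.gauss(0.0, 0.1) for xi in x]
    for kappa in (0.0, 0.3):
        _, _, hist = boost(x, y, kappa=kappa)
        print(f"kappa={kappa}: final training MSE = {hist[-1]:.4f}")
```

Setting `kappa=0` recovers plain L2 gradient boosting with shrinkage `nu`. The `history` list records the training error after every step, which is one simple way to look at the whole generated sequence rather than only the final predictor.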
