On the Robust Random Forest Model with Expectile Learning for Multilevel Classification of Obesity Risk
Wisnowan Hendy Saputra, Sabrina Julietta Arisanty

Accurate obesity risk classification is often hindered by the asymmetric and heteroscedastic nature of health data, where traditional mean-based machine learning models fail to capture critical distribution tails. This study addresses this gap by proposing a robust Expectile Random Forest (ERF) model, a novel ensemble architecture that integrates an expectile learning framework via the Asymmetric Least Squares (ALS) loss function for seven-level (multilevel) classification. Utilizing a dataset of 2111 empirical records, the sensitivity analysis identifies τ=0.7 as the optimal configuration, achieving an overall Accuracy of 94.6 ± 0.7% and a Macro F1-Score of 94.5 ± 0.7%. This performance represents a significant quantitative improvement over state-of-the-art benchmarks, outperforming XGBoost by 1.8% and standard Random Forest by 3.9%. Feature importance analysis identifies body weight, age, and sedentary factors as primary predictors, while the ERF model demonstrates exceptional ordinal consistency and robustness against clinical outliers. These findings provide a superior methodological framework for developing precise medical decision support systems, shifting the paradigm from central-tendency predictions to tail-sensitive health risk mapping.
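To make the role of the asymmetry parameter τ concrete, the following is a minimal sketch of the standard ALS loss and of a sample expectile computed by iteratively reweighted means. It is not the ERF model itself, since the abstract does not specify how the loss enters the forest; the function names `als_loss` and `expectile` and their parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def als_loss(residuals, tau):
    """Mean asymmetric least squares loss: |tau - 1(r < 0)| * r**2.

    Positive residuals are weighted by tau, negative ones by 1 - tau.
    """
    r = np.asarray(residuals, dtype=float)
    weights = np.where(r < 0, 1.0 - tau, tau)
    return float(np.mean(weights * r ** 2))

def expectile(y, tau, n_iter=100, tol=1e-10):
    """Sample tau-expectile, i.e. the constant minimizing the ALS loss.

    Uses a fixed-point iteration: a weighted mean whose weights depend
    on which side of the current estimate each observation falls.
    """
    y = np.asarray(y, dtype=float)
    m = y.mean()  # the 0.5-expectile is the ordinary mean
    for _ in range(n_iter):
        w = np.where(y < m, 1.0 - tau, tau)
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            return float(m_new)
        m = m_new
    return float(m)

# Illustrative right-skewed sample (made-up values, not the study data).
sample = np.array([0.0, 0.0, 0.0, 10.0])
mean_like = expectile(sample, tau=0.5)   # coincides with the sample mean
upper_tail = expectile(sample, tau=0.7)  # pulled toward the upper tail
```

With τ=0.5 the loss reduces to ordinary least squares and the expectile equals the mean; choosing τ above 0.5, such as the τ=0.7 reported in the abstract, penalizes under-prediction more heavily and shifts the estimate toward the upper tail.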