Data‐Efficient and Explainable Ensemble Learning for HVAC Fan Power Modeling in Green Building Operations
Jiarui Yang, Bin Xia

ABSTRACT
Accurate modeling of HVAC energy demand is essential for improving operational efficiency in green buildings. However, newly commissioned building systems often have limited historical data, which restricts the reliability of conventional data‐driven prediction models. Addressing this challenge requires learning strategies that perform robustly under small‐sample conditions while keeping the model transparent. This study develops a data‐efficient modeling framework for predicting fan power consumption in air‐handling units. The proposed approach uses an uncertainty‐based active learning strategy to iteratively select informative samples and integrates ensemble regression models, including Random Forest and XGBoost, for fan power prediction. SHAP‐based explainability analysis is further incorporated to identify the key drivers of energy demand and enhance model interpretability. A real operational dataset of 33,888 observations, collected from an office building energy center in Italy, is used for evaluation. Results show that the proposed framework achieves strong and stable predictive performance using only a small fraction of the labeled data. The optimized ensemble model achieves a test coefficient of determination ($R^2$) of 0.971 and generalizes stably across the training and testing datasets. The explainability analysis further reveals that thermal variables dominate the energy consumption mechanism.
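To make the uncertainty‐based sample selection described above concrete, the following is a minimal sketch and not the authors' implementation. It assumes that uncertainty is measured as the disagreement (standard deviation) among the individual trees of a scikit-learn Random Forest, which the abstract does not specify. The data are synthetic, and the names `tree_std`, `active_learning_loop`, and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score


def tree_std(model, X):
    """Per-sample standard deviation of the individual trees' predictions.

    Assumption: tree disagreement stands in for predictive uncertainty.
    """
    per_tree = np.stack([tree.predict(X) for tree in model.estimators_])
    return per_tree.std(axis=0)


def active_learning_loop(X_pool, y_pool, n_init=20, batch=10, rounds=8, seed=0):
    """Grow the labeled set by repeatedly querying the most uncertain samples."""
    rng = np.random.default_rng(seed)
    # Start from a small random set of labeled samples.
    labeled = rng.choice(len(X_pool), size=n_init, replace=False).tolist()
    for _ in range(rounds):
        model = RandomForestRegressor(n_estimators=50, random_state=seed)
        model.fit(X_pool[labeled], y_pool[labeled])
        # Score only the samples that have not been labeled yet.
        unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
        scores = tree_std(model, X_pool[unlabeled])
        # Query the `batch` samples with the largest tree disagreement.
        picked = unlabeled[np.argsort(scores)[-batch:]]
        labeled.extend(picked.tolist())
    # Final fit on everything labeled so far.
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    model.fit(X_pool[labeled], y_pool[labeled])
    return model, labeled


# Synthetic stand-in data (NOT the paper's dataset): three scaled features
# and a toy nonlinear target with a small amount of noise.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(2500, 3))
y = 2.0 * X[:, 0] ** 3 + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, size=2500)
X_pool, y_pool = X[:2000], y[:2000]
X_test, y_test = X[2000:], y[2000:]

model, labeled_idx = active_learning_loop(X_pool, y_pool)
test_r2 = r2_score(y_test, model.predict(X_test))
print(f"labeled {len(labeled_idx)} of {len(X_pool)} pool samples, test R^2 = {test_r2:.3f}")
```

In this sketch only 100 of the 2,000 pool samples are ever labeled. The same loop could wrap a different regressor, provided some per-sample uncertainty estimate is available for it.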