DOI: 10.3390/atmos17060624 ISSN: 2073-4433

A Hybrid Statistical-Machine Learning Framework for Risk-Based Screening of High-Frequency Carbon Emission Data Under Emissions Trading Systems

Changyi Weng, Zhenghua Shu, Jueying Qian, Jingwei Fan, Xiaohu Luo

Reliable carbon emission data are essential for the effective operation of emissions trading systems (ETS), especially as China’s ETS expands to include energy-intensive industries. This study proposes a hybrid, risk-based anomaly detection framework for high-frequency CO2 emission data by cross-validating material-based emissions with flue gas-based monitoring data. Under normal operating conditions, the ratio of material-based to flue gas-based emissions is expected to remain within a relatively stable distribution. Potential high-risk periods can therefore be identified when this relationship is distorted or when local temporal patterns deviate from expected behavior. The framework combines Hartigan’s dip test with a window-based Random Forest (RF) classifier, which is suitable for continuous monitoring data that may exhibit temporal dependence. The framework was evaluated using 15-min CO2 emission data from a cement production facility, with simulations of anomaly magnitude, duration, and mode. Results show that the dip test performs well for long-lasting or strong anomalies, whereas the RF model is more sensitive to subtle, short-term deviations. In the integrated framework, 94.7% of anomalous periods were detected by at least one method and flagged as potential data-quality risks, whereas normal periods were not flagged, supporting its use to prioritize verification efforts.
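The integration logic described in the abstract can be illustrated with a minimal sketch, using only the Python standard library. It computes the ratio of material-based to flue gas-based emissions, cuts the series into fixed-size windows, and flags a window when at least one detector flags it. All function names, the window size and step, and the boolean flag inputs are hypothetical; the dip test and the Random Forest classifier are not implemented here and are represented only by their per-window outputs, so this is not the authors' implementation.

```python
from typing import List, Sequence


def emission_ratio(material: Sequence[float], flue_gas: Sequence[float]) -> List[float]:
    """Per-interval ratio of material-based to flue gas-based CO2 emissions."""
    if len(material) != len(flue_gas):
        raise ValueError("both series must have the same length")
    # Assumption: intervals with non-positive flue gas readings are marked NaN.
    return [m / f if f > 0 else float("nan") for m, f in zip(material, flue_gas)]


def windows(series: Sequence[float], size: int, step: int) -> List[List[float]]:
    """Split a series into fixed-size windows (size and step are illustrative)."""
    return [list(series[i:i + size]) for i in range(0, len(series) - size + 1, step)]


def combine_flags(dip_flags: Sequence[bool], rf_flags: Sequence[bool]) -> List[bool]:
    """Flag a window if at least one detector flags it (logical OR)."""
    return [d or r for d, r in zip(dip_flags, rf_flags)]


def detection_rate(flags: Sequence[bool], truth: Sequence[bool]) -> float:
    """Share of truly anomalous windows that were flagged."""
    hits = [f for f, t in zip(flags, truth) if t]
    return sum(hits) / len(hits) if hits else float("nan")
```

In this sketch, `dip_flags` and `rf_flags` would come from the two detectors applied to the same windows of the ratio series. The OR rule mirrors the abstract's criterion that a period counts as detected when at least one method flags it.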
