DOI: 10.3390/app16136538 ISSN: 2076-3417

HARMONI: A Two-Stage Hybrid Learning Framework with Dynamic Metric Learning for Interpretable NIDS

Rongxin Hu, Zhiqiang Zhang, Minhao Li, Youwen Wen, Le Wang

The increasing sophistication of cyber attacks has created a growing demand for effective Network Intrusion Detection Systems (NIDSs). Although deep learning has improved NIDS performance, existing models often lack adaptive mechanisms for spatiotemporal feature fusion and struggle with complex traffic distributions characterized by severe intra-class heterogeneity and inter-class overlap. Meanwhile, current interpretability methods mainly rely on feature importance analysis and provide limited insight into the model’s decision process. To address these challenges, we propose HARMONI, a two-stage hybrid learning framework that enhances both detection accuracy and model interpretability. In the first stage, a dual-branch Convolutional Neural Network and Gated Recurrent Unit (CNN-GRU) architecture extracts spatiotemporal features, which are dynamically fused through a lightweight adaptive gating network. Representation learning is jointly optimized with a Dynamic Class-Center Loss that enforces intra-class compactness and inter-class separability in the latent space. In the second stage, the learned deep representations are concatenated with the raw traffic features and fed into an ensemble classifier. This residual-style design mitigates information loss during deep encoding while leveraging the non-linear modeling capability of ensemble learning. We further develop a multi-level interpretability framework based on SHapley Additive exPlanations (SHAP) that analyzes global feature importance, individual feature contributions, and feature interactions to provide quantitative insights into the model’s decision mechanisms. Experiments on four benchmark datasets show that HARMONI consistently outperforms state-of-the-art baselines, including representative deep learning and ensemble methods, achieving 80.19% and 78.24% accuracy on NSL-KDD and UNSW-NB15, respectively.
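The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact form of the gating network or of the Dynamic Class-Center Loss, so the element-wise sigmoid gate, the plain center-loss term (compactness only, with no inter-class separation term), and the moving-average center update below are all assumptions. Every function name, dimension, and the random stand-in embeddings are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_cnn, h_gru, W, b):
    """Assumed gate: g = sigmoid([h_cnn; h_gru] W + b), z = g*h_cnn + (1-g)*h_gru."""
    g = sigmoid(np.concatenate([h_cnn, h_gru], axis=1) @ W + b)
    return g * h_cnn + (1.0 - g) * h_gru

def center_loss(z, labels, centers):
    """Assumed compactness term: mean squared distance to the own-class center."""
    diff = z - centers[labels]
    return 0.5 * np.mean(np.sum(diff ** 2, axis=1))

def update_centers(z, labels, centers, alpha=0.5):
    """Assumed 'dynamic' update: move each center toward its class batch mean."""
    new_centers = centers.copy()
    for c in np.unique(labels):
        batch_mean = z[labels == c].mean(axis=0)
        new_centers[c] += alpha * (batch_mean - centers[c])
    return new_centers

# Random stand-ins for the raw traffic features and the two branch outputs.
rng = np.random.default_rng(0)
n, d_raw, d = 8, 5, 4
x_raw = rng.normal(size=(n, d_raw))
h_cnn = rng.normal(size=(n, d))   # placeholder for the CNN branch embedding
h_gru = rng.normal(size=(n, d))   # placeholder for the GRU branch embedding
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])

# Stage 1: adaptive fusion, then the class-center objective.
W = rng.normal(size=(2 * d, d)) * 0.1
b = np.zeros(d)
z = gated_fusion(h_cnn, h_gru, W, b)

centers = np.zeros((2, d))
loss_before = center_loss(z, labels, centers)
centers = update_centers(z, labels, centers, alpha=1.0)
loss_after = center_loss(z, labels, centers)

# Stage 2 input: learned representation concatenated with the raw features.
stage2_input = np.concatenate([z, x_raw], axis=1)
print(stage2_input.shape, loss_after <= loss_before)
```

With `alpha=1.0` each center lands on its class mean for the batch, which minimizes the compactness term for fixed embeddings. In training, the embeddings and centers would be updated together, and `stage2_input` would be passed to whichever ensemble classifier is used in the second stage.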
