DOI: 10.3390/make8070190 ISSN: 2504-4990

Class-Conditional Conformal Prediction for Reliable Anomaly Detection Under Extreme Class Imbalance

Bashair Althani

Anomaly detection systems deployed in critical applications require not only high accuracy but also reliable uncertainty quantification and coverage guarantees. This paper is an empirical study, rather than a contribution of new conformal-prediction machinery, of class-conditional (Mondrian) conformal prediction for anomaly detection under extreme class imbalance; it characterizes where standard conformal prediction fails and how class-conditional calibration restores valid coverage. Class-conditional conformal prediction constructs prediction sets that, under exchangeability, contain the true label with user-specified confidence (e.g., 90%), enabling systems to abstain on uncertain predictions. Standard conformal prediction fails catastrophically under extreme imbalance, achieving only 52.94% anomaly coverage at a 1:345 imbalance ratio, whereas class-conditional calibration maintains 90.59% anomaly coverage by computing quantiles separately for each class. We apply the standard softmax-based nonconformity score s = 1 − f_y(x), where f_y(x) is the model's predicted probability for class y, within each class, ensuring valid coverage for both normal and anomalous instances with coverage gaps ranging from 0.50% to 5.18% depending on dataset characteristics. Extensive experiments on three real-world datasets (Microsoft Azure KPI, Yahoo, NAB) demonstrate that the method achieves empirical coverage within 0.06–0.33% of theoretical targets at significance levels α ≤ 0.05; on the most imbalanced benchmark (Microsoft Azure KPI at a 1:345 ratio and α = 0.10), this corresponds to a 37.65 percentage point improvement in anomaly coverage over standard conformal prediction. We restate finite-sample coverage bounds and exchangeability conditions in the binary anomaly detection setting and validate them empirically through Monte Carlo simulation.
Multi-model evaluation across XGBoost, Random Forest, and Neural Networks demonstrates the model-agnostic property of the framework, while also identifying conditions (poor base-classifier discrimination, small minority calibration sets) under which coverage may be marginally violated. Comparison with alternative uncertainty quantification methods (isotonic probability calibration, Monte Carlo dropout) shows that only conformal prediction provides formal guarantees while maintaining 90.59% anomaly coverage, versus 76.47% and 84.71%, respectively, for the alternatives. The abstention mechanism identifies 34–66% of predictions as uncertain at high confidence levels (99%), enabling safety-critical systems to defer difficult cases to human experts while preserving baseline discrimination (ROC-AUC unchanged).
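
The class-conditional calibration described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names (`conformal_quantile`, `calibrate_class_conditional`, `prediction_sets`) are hypothetical, and the sketch assumes the base classifier supplies an array of class probabilities with one column per class. It computes the score s = 1 − f_y(x) on a held-out calibration set, takes a separate finite-sample quantile for each class, and includes a label in the prediction set when its score does not exceed that class's threshold.

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: use the trivial threshold,
        # so this class is always included in the prediction set.
        return np.inf
    return scores[k - 1]


def calibrate_class_conditional(cal_probs, cal_labels, alpha):
    """Compute one threshold per class from that class's calibration examples only."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    thresholds = {}
    for c in range(cal_probs.shape[1]):
        mask = cal_labels == c
        # Nonconformity score s = 1 - f_y(x), evaluated at the true class y = c.
        scores_c = 1.0 - cal_probs[mask, c]
        thresholds[c] = conformal_quantile(scores_c, alpha)
    return thresholds


def prediction_sets(test_probs, thresholds):
    """Boolean matrix: entry [i, c] is True when label c is in the set for sample i."""
    test_probs = np.asarray(test_probs, dtype=float)
    columns = [
        (1.0 - test_probs[:, c]) <= thresholds[c]
        for c in range(test_probs.shape[1])
    ]
    return np.column_stack(columns)
```

One possible abstention rule, again an assumption rather than the paper's stated procedure, is to defer a sample whenever its prediction set does not contain exactly one label, for example `abstain = prediction_sets(test_probs, thresholds).sum(axis=1) != 1`. The `np.inf` fallback also shows why small minority calibration sets matter: with too few anomalous calibration examples for a given α, the anomaly threshold becomes trivial and the label is always included.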
