X-OCTANE: eXplainable On-Chip Telemetry-based Anomaly Notification Engine for Silicon Lifecycle Management
Eduardo Ortega, Hsiao-Ping Ni, Arjun Hati, Jonti Talukdar, Woohyun Paik, Fei Su, Rita Chattopadhyay, Krishnendu Chakrabarty
Silicon Lifecycle Management (SLM) is essential for ensuring the reliability, security, and performance of modern computing platforms. Conventional SLM approaches rely on off-chip or software-only analytics, which lack on-die interpretability. We present X-OCTANE, an explainable on-chip telemetry-based anomaly notification engine. X-OCTANE is a hybrid explainability framework that integrates Causal-Preserving Mutual Information (CP-MI) with SHAP-based feature attribution for interpretable feature ranking and selection. The proposed feature ranking-and-selection procedure retains competitive safety-oriented anomaly detection and bolsters security-specific in-field monitoring of compute platforms. For the 7 nm ASAP technology, X-OCTANE achieves competitive anomaly detection performance with less than 0.50% CPU die area and 1.0% non-idle power overhead. In addition, X-OCTANE improves anomaly-driven diagnosis accuracy by over