ScionPathML: Enabling an Empirical Measurement Dataset and Benchmarks for Path-Aware Networking
Damien Rossi, Sina Keshvadi, Yogesh Sharma

Path-aware networking architectures, such as SCION, give endpoints explicit visibility into multiple inter-domain paths, opening new opportunities for data-driven path selection, reliability prediction, and automated diagnosis. However, the lack of standardized, machine learning-ready datasets collected from live path-aware deployments has slowed progress in this domain. We present ScionPathML, an open-source measurement and data-standardization pipeline that abstracts away the complexity of SCION’s tooling and continuously collects longitudinal performance measurements (RTT, packet loss, jitter, bandwidth, and per-hop latency) in formats directly usable by ML pipelines. Using a four-week, multi-region campaign across four vantage points on the SCIONLab testbed, we release a public dataset capturing path availability, churn, lifetimes, and end-to-end performance across concurrently available paths. To demonstrate its application, we define four reproducible benchmark tasks: short-horizon performance forecasting, path failure prediction, anomaly detection, and multi-objective path recommendation. Each task is accompanied by baseline models and evaluation protocols. Our results show that live SCION path performance exhibits exploitable temporal structure, enabling accurate short-term predictions and early detection of availability drops. Together, the dataset, benchmarks, and open tooling substantially lower the barrier for ML researchers and provide a reproducible foundation for accelerating innovation in path-aware networking.