DOI: 10.1002/cpe.70844 ISSN: 1532-0626

Tiered Scheduling and Cost‐Benefit‐Aware Speculation for Optimizing Spark in Heterogeneous Clusters

Weipeng Du, Hang Yuan, Yufei Ren, Junyang Yu, Xiaojin Ren

ABSTRACT

In heterogeneous clusters, Apache Spark's task scheduling strategy does not account for node performance variations, which can lead to workload imbalance. Its straggler detection mechanism also lacks heterogeneity awareness, potentially causing inaccurate straggler detection and unnecessary speculative execution. To address these issues, we propose Heterogeneity‐Aware Tiered Scheduling (HATS), which classifies nodes into performance tiers and assigns tasks to appropriate tiers according to their computational demands while respecting data locality. For speculative execution, we propose Cost‐Benefit‐Aware Speculative Execution (CBASE), which maintains node‐adaptive thresholds for straggler detection, uses an execution time model to prioritize stragglers that are most likely to delay job completion, and applies cost‐benefit analysis to determine whether speculative tasks should be launched. We implement both strategies in Spark 3.4.2 and evaluate them using representative benchmarks. Experimental results show that HATS reduces job completion time by 16.1%–25.1% and improves average CPU and memory utilization by 11.6% and 13.1%, respectively, compared with Spark's default scheduling strategy. CBASE reduces job completion time by 18.4%–22.3% and improves cluster throughput by 22.5%–28.8% compared with Spark's default speculative execution strategy.
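
The abstract gives no formulas for HATS or CBASE, so the following is only a minimal illustrative sketch of the two ideas it names: grouping nodes into performance tiers, and launching a speculative copy only when its expected cost is lower than the expected benefit. Every name here (`Node`, `RunningTask`, `assign_tiers`, `is_straggler`, `should_speculate`), the linear progress model, the per-tier median threshold, and the constants `multiplier` and `launch_overhead` are assumptions made for illustration. None of it is the authors' implementation or Spark's API.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Node:
    name: str
    speed: float  # hypothetical relative performance score (higher is faster)


@dataclass
class RunningTask:
    task_id: int
    node: Node
    elapsed: float   # seconds since launch
    progress: float  # fraction of work done, in (0, 1]


def assign_tiers(nodes, num_tiers=3):
    """Rank nodes by speed and split them into tiers; tier 0 is the fastest."""
    ranked = sorted(nodes, key=lambda n: n.speed, reverse=True)
    size = -(-len(ranked) // num_tiers)  # ceiling division
    return {n.name: i // size for i, n in enumerate(ranked)}


def estimated_remaining(task):
    """Assumed linear progress model: remaining = elapsed * (1 - p) / p."""
    return task.elapsed * (1.0 - task.progress) / task.progress


def is_straggler(task, tier_durations, multiplier=1.5):
    """Assumed node-adaptive threshold: compare the task's projected total
    time with the median duration observed on its own tier, rather than
    with a single cluster-wide median."""
    threshold = multiplier * median(tier_durations)
    return task.elapsed + estimated_remaining(task) > threshold


def should_speculate(task, backup_node, base_duration, launch_overhead=1.0):
    """Launch a copy only if it is expected to finish before the original.

    base_duration is the assumed task duration on a node with speed 1.0.
    """
    copy_time = launch_overhead + base_duration / backup_node.speed
    return copy_time < estimated_remaining(task)


nodes = [Node("a", 2.0), Node("b", 1.0), Node("c", 0.5)]
tiers = assign_tiers(nodes)

# A task on the slowest node that is 25% done after 30 s.
slow = RunningTask(1, nodes[2], elapsed=30.0, progress=0.25)
# A task on the same node that is already 90% done.
nearly_done = RunningTask(2, nodes[2], elapsed=30.0, progress=0.9)
```

In this sketch, `slow` has about 90 s of work left, so a copy on node `a` (about 11 s including overhead) is worth launching, while `nearly_done` has only a few seconds left and a copy would be wasted work. A cluster-wide threshold could flag both tasks simply because they run on a slow node, which is the kind of unnecessary speculation the abstract describes.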
