Structure-Aware Delta Debugging with Geometric-Information Weights
Yonggang Tao, Jingling XueDelta debugging is a fundamental technique for automatically minimizing failure-inducing inputs. ProbDD improves ddmin via probability-guided search, and Weighted Delta Debugging (WDD) further incorporates token-based weighting to mitigate size disparities. However, token counts measure textual volume rather than structural complexity, potentially misrepresenting reduction difficulty. We propose Structure-Aware Delta Debugging (SADD), which models structural complexity using geometric and information-theoretic properties of the syntax tree. SADD defines a unified weight function that integrates geometric volume, decision uniformity, and effective branching complexity. We instantiate this model in two variants: SA ddmin , which performs structure-aware partitioning within ddmin, and SA ProbDD , which injects structural weights into ProbDD's gain function, leaving the underlying search control unchanged. We evaluate SADD against ddmin, ProbDD, and WDD (instantiated as W ddmin and W ProbDD ) within two state-of-the-art frameworks, HDD and Perses, on 62 real-world C and XML benchmarks. Under HDD, SADD reduces average debugging time by up to 57.12% over ddmin and 15.04% over W ddmin on C, and by 30.12% over ddmin on XML; improvements over ProbDD reach 14.64% on C and 20.10% on XML. Gains are smaller under Perses or on structurally simple inputs, where token-based weighting already suffices. An ablation study shows that geometric volume forms the foundation of improvement, while information-theoretic metrics provide complementary refinements. Overall, explicitly modeling structural complexity improves delta debugging when input hierarchies exhibit meaningful structural diversity.