Debugging Engine Enhanced by Prior Knowledge: Can We Teach LLM How to Debug?
Kunyi Li, Sai Wu, Xiu Tang, Chang Yao, Songhao Bu, Quanqing Xu, Gang Chen
Automated Program Repair (APR) powered by Large Language Models (LLMs) has shown strong potential for improving software reliability. However, existing LLM-based APR approaches underutilize the rich debugging knowledge latent in large-scale bug-fix corpora. Prior work has primarily advanced APR through multi-agent coordination, prompt engineering, task decomposition, or model training. While effective to some extent, these methods rely heavily on the implicit reasoning capacity of LLMs, without explicitly modeling the debugging knowledge that underlies successful program repair. We present DeepK, a novel framework that systematically extracts, validates, and reuses debugging knowledge to guide LLMs in APR. Rather than treating historical bug-fix pairs solely as contextual exemplars, DeepK distills them into a structured knowledge base of verified debugging knowledge. This knowledge base can be seamlessly integrated into diverse APR pipelines, providing interpretable reasoning traces and step-wise repair strategies that enhance the repair performance.
We evaluate DeepK on multiple benchmarks (ACPR, AtCoder) using both GPT-4o and DeepSeek-V3. Results show that DeepK consistently surpasses state-of-the-art APR systems in repair accuracy. Ablation studies confirm the importance of edit description generation, multi-perspective retrieval, and two-fold debugging knowledge. Varying the number of retrieved entries further highlights the trade-off between informativeness and noise. These findings emphasize the essential contribution of explicit debugging knowledge in advancing LLM-based APR and establish DeepK as an extensible framework for building more reliable repair systems.