TLR: Codebase-Level C Memory Management Error Repair with Large Language Models

DOI: 10.1145/3797085 ISSN: 2994-970X

Xiao Cheng, Zhihao Guo, Huan Huo, Yulei Sui

Memory management errors in C remain a leading source of software vulnerabilities due to the inherent complexity of manual memory handling. Traditional Automated Program Repair (APR) largely relies on rule- or template-based techniques, which require expert-crafted specifications and often struggle to generalize. Recently, Large Language Models (LLMs) have emerged as a complementary approach, leveraging broad exposure to codebases and programming idioms to synthesize fixes that can extend beyond existing templates and rules. This paper introduces TLR, a novel framework that augments LLM-based repair with typestate-guided context retrieval. By using a finite typestate automaton to track error-propagation paths and memory state transitions, our approach provides the LLM with focused, semantically rich context for codebase-level memory error repair, addressing both interprocedural reasoning and LLM context window limitations. Our framework repaired 37 out of 49 real-world memory errors drawn from 14 open-source projects that together comprise approximately 1.57 million lines of code. Compared with the state-of-the-art memory error APR tools SAVER and ProvenFix, our approach correctly fixes 14.50× and 2.36× more errors, respectively. On the double-free and use-after-free subset, TLR repairs all 7 cases, whereas the crash-constraint-driven CrashRepair repairs only 1. Moreover, TLR outperforms current open-source state-of-the-art LLM-based repair tools, repairing more errors than SWE-agent 1.0 and the tree-of-thought agent San2Patch while introducing far fewer harmful patches. We have also repaired three critical zero-day memory errors, with fixes that have been accepted and implemented by the original developers. These results highlight a promising paradigm for codebase-level program repair through program analysis-guided, retrieval-augmented LLMs, combining formal verification strengths with neural model adaptability.
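To make the idea of a finite typestate automaton over memory states concrete, the following is a minimal sketch in C. It is not TLR's actual automaton: the state names, event names, transition table, and the helpers `ts_step` and `ts_run` are all hypothetical, chosen only to illustrate how use-after-free, double-free, and leak conditions can surface as transitions on a single heap object. It also simplifies real C semantics (for instance, `free(NULL)` is legal, but the sketch treats any free of an unallocated object as an error).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical typestates of one heap object (illustrative, not from the paper). */
typedef enum { TS_UNALLOCATED, TS_ALLOCATED, TS_FREED, TS_ERROR, TS_COUNT } TypeState;

/* Hypothetical events observed along one program path. */
typedef enum { EV_MALLOC, EV_USE, EV_FREE, EV_COUNT } Event;

/* Transition table: rows are current states, columns are events.
 * Assumption: re-allocating over a live object counts as an error,
 * because the only reference to the old object is lost. */
static const TypeState TRANSITIONS[TS_COUNT][EV_COUNT] = {
    /*                  EV_MALLOC     EV_USE        EV_FREE  */
    /* UNALLOCATED */ { TS_ALLOCATED, TS_ERROR,     TS_ERROR },
    /* ALLOCATED   */ { TS_ERROR,     TS_ALLOCATED, TS_FREED },
    /* FREED       */ { TS_ALLOCATED, TS_ERROR,     TS_ERROR }, /* use-after-free, double-free */
    /* ERROR       */ { TS_ERROR,     TS_ERROR,     TS_ERROR }, /* absorbing state */
};

/* Apply a single event to the current state. */
TypeState ts_step(TypeState s, Event e) {
    assert(s < TS_COUNT && e < EV_COUNT);
    return TRANSITIONS[s][e];
}

/* Run the automaton over a path of n events, starting from TS_UNALLOCATED.
 * Returns the final state. *first_error receives the index of the event
 * that first entered TS_ERROR, or n if no error occurred.
 * A path that ends in TS_ALLOCATED never freed the object (a leak). */
TypeState ts_run(const Event *path, size_t n, size_t *first_error) {
    TypeState s = TS_UNALLOCATED;
    *first_error = n;
    for (size_t i = 0; i < n; i++) {
        s = ts_step(s, path[i]);
        if (s == TS_ERROR) {
            *first_error = i;
            break;
        }
    }
    return s;
}
```

In this sketch, the index written to `first_error` marks the statement where the object's state first becomes invalid. One plausible reading of "typestate-guided context retrieval" is that the statements driving the object along such a path are the ones worth handing to the LLM as context, although the abstract does not specify the retrieval procedure.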
