DOI: 10.1145/3822177 ISSN: 1049-331X
LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning
Bo Hou, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, Li Zhang
Atomic commits, which address a single development concern, are a best practice in software development. In practice, however, developers often produce tangled commits that mix unrelated changes, complicating code review and maintenance. Prior untangling approaches (rule-based, feature-based, or graph-based) have made progress but typically rely on shallow signals and struggle to distinguish explicit dependencies (e.g., control/data flow) from implicit ones (e.g., semantic or conceptual relationships). In this paper, we propose ColaUntangle, a new collaborative consultation framework for commit untangling that models both explicit and implicit dependencies among code changes. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture: one agent specializes in explicit dependencies, another in implicit ones, and a reviewer agent synthesizes their perspectives through iterative consultation. To capture structural and contextual information, we construct Explicit and Implicit Contexts, enabling agents to reason over code relationships with both symbolic and semantic depth. We evaluate ColaUntangle on two widely used datasets (1,612 C# and 14k Java tangled commits). Experimental results show that ColaUntangle outperforms the best-performing baseline, achieving an improvement of 44% on the C# dataset and 82% on the Java dataset. These findings highlight the potential of LLM-based collaborative frameworks for advancing automated commit untangling.