TUSR: A Test Unit-Based Framework for Repairing Obsolete GUI Test Scripts
Shaoheng Cao, Minxue Pan, Xuandong Li

Graphical User Interface (GUI) testing is a primary technique for testing mobile applications. Among existing testing methods, script-based testing is widely adopted because test scripts can be replayed across devices and versions. However, as mobile apps evolve, frequent changes to GUI appearance and interaction logic often render scripts written for earlier versions obsolete. Existing repair approaches typically follow a stepwise framework, repairing scripts action by action. While effective for minor GUI appearance changes, they struggle when interaction logic is modified. In this paper, we present TUSR, a test unit-based repair framework that fundamentally departs from the stepwise paradigm. TUSR introduces the concept of a "Test Unit" (a contiguous sequence of test actions serving a single testing intention) and repairs scripts at the unit level. It first splits scripts into test units using a multi-modal LLM with a rule-based correction mechanism to ensure consistency, then conducts dynamic repair guided by a Chain-of-Thought prompt enhanced with reflective memory. Experiments on 20 real-world Android applications, covering 262 test scripts and 3,485 actions, demonstrate that TUSR significantly outperforms state-of-the-art black-box and white-box repair approaches.
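
To make the notion of a test unit concrete, the sketch below shows one way a rule-based correction pass over LLM-proposed unit boundaries might look. It is an illustration only, not TUSR's implementation: the abstract does not state the actual rules, so the `Unit` type, the `correct_units` function, and the specific rules (clip overlaps, close gaps, cover every action) are all assumptions chosen to enforce the stated property that units are contiguous action sequences.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Unit:
    """Hypothetical test unit: actions start..end (inclusive) share one intention."""
    intention: str
    start: int
    end: int


def correct_units(units: List[Unit], n_actions: int) -> List[Unit]:
    """Assumed correction rules: make the units a contiguous, non-overlapping
    partition of the action indices 0..n_actions-1."""
    if n_actions <= 0:
        return []
    fixed: List[Unit] = []
    cursor = 0  # index of the first action not yet assigned to a unit
    for u in sorted(units, key=lambda x: (x.start, x.end)):
        start = max(u.start, cursor)      # clip overlap with the previous unit
        end = min(u.end, n_actions - 1)   # clip anything past the last action
        if start > end:
            continue                      # nothing left of this unit after clipping
        if start > cursor:
            if fixed:
                # Close the gap by extending the previous unit forward.
                prev = fixed[-1]
                fixed[-1] = Unit(prev.intention, prev.start, start - 1)
            else:
                start = 0                 # first unit absorbs leading actions
        fixed.append(Unit(u.intention, start, end))
        cursor = end + 1
    if not fixed:
        # No usable proposal: fall back to a single unit over the whole script.
        return [Unit("unknown", 0, n_actions - 1)]
    if cursor < n_actions:
        # Trailing actions are attached to the last unit.
        last = fixed[-1]
        fixed[-1] = Unit(last.intention, last.start, n_actions - 1)
    return fixed


if __name__ == "__main__":
    # Proposed split with one overlap (action 2) and two gaps (actions 5 and 8).
    proposed = [Unit("login", 0, 2), Unit("search", 2, 4), Unit("checkout", 6, 7)]
    for unit in correct_units(proposed, n_actions=9):
        print(unit)
```

Under these assumed rules, every action ends up in exactly one unit, which is the precondition for repairing a script unit by unit rather than action by action.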