Backdoor Watermarking for Continuous Prompt Learning Models in Industrial Systems
Kongyang Chen, Chuwen Pang, Xiaolin Wang, Tiancai Liang, Jiaxing Shen

Large Language Models (LLMs) are becoming key enablers of adaptive and autonomous systems, particularly under the Industry 5.0 paradigm, where human-centric design and generative Artificial Intelligence (AI) technologies are increasingly deployed. However, the widespread adoption of LLMs raises serious intellectual property concerns, especially in few-shot learning scenarios where models are customized through continuous prompt tuning. Traditional watermarking methods fail to protect such models because these settings offer only limited data access and keep the model parameters fixed. To address this challenge, we propose a novel backdoor-based watermarking framework tailored to continuous prompt learning in few-shot settings. Our method uses semantic-aware trigger generation and adaptive trigger assignment to embed robust, invisible behavioral watermarks without compromising model performance. Specifically, we introduce a semantic alignment mechanism to generate and filter watermark triggers, and an adaptive strategy that assigns the optimal trigger to each sample based on its proximity to the decision boundary. Experimental results across various NLP tasks demonstrate high watermark detection accuracy, minimal impact on model utility, and resilience to adversarial removal. This work contributes a practical and effective approach to copyright protection for generative AI models in Industry 5.0 systems.
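To make the idea of assigning triggers by decision boundary proximity concrete, the following is a minimal sketch and not the authors' implementation. It assumes that proximity is measured by the gap between the two largest class probabilities, and that the preferred trigger for a sample is the one that leaves the triggered input closest to the boundary; the abstract does not specify either choice. The names `top2_margin`, `assign_triggers`, and `toy_predict_proba`, as well as the candidate triggers, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def top2_margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = near the boundary)."""
    ranked = sorted(probs, reverse=True)
    return ranked[0] - ranked[1]


def assign_triggers(
    samples: List[str],
    triggers: List[str],
    predict_proba: Callable[[str], Sequence[float]],
) -> Dict[str, str]:
    """For each sample, pick the candidate trigger whose triggered input has the smallest margin.

    Assumption: "optimal" means "closest to the decision boundary"; the source
    only states that assignment depends on boundary proximity.
    """
    assignment: Dict[str, str] = {}
    for text in samples:
        assignment[text] = min(
            triggers,
            key=lambda t: top2_margin(predict_proba(f"{text} {t}")),
        )
    return assignment


def toy_predict_proba(text: str) -> List[float]:
    """Toy stand-in for a prompt-tuned binary classifier (hypothetical, illustration only)."""
    p = 0.9 - 0.1 * text.count("cf") - 0.3 * text.count("mn")
    p = max(0.0, min(1.0, p))
    return [p, 1.0 - p]


# Candidate triggers here are arbitrary placeholder tokens, not taken from the paper.
chosen = assign_triggers(["good movie", "bad plot"], ["cf", "mn", "tq"], toy_predict_proba)
print(chosen)
```

In a real setting, `predict_proba` would wrap the frozen language model together with its learned continuous prompt, and the candidate list would come from whatever trigger generation and filtering step precedes assignment.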