SenScanner: An Artificial Intelligence-Based Automatic Password-Related Secret Detection System in Mixed Texts
Zhuofeng He, Yumeng Guo, Bo Zhang, Wenzhi Cao

The rapid expansion of the Internet has enabled large-scale information sharing while also increasing the risk of sensitive information leakage. Authentication secrets, including passwords and API keys, may be unintentionally exposed in publicly accessible environments such as web pages, network packets, and code-sharing platforms when they are mishandled by developers or operators. Such leakage allows attackers to abuse third-party authentication services and may lead to unauthorized access, fraud, or broader compromise. Therefore, timely and accurate detection of sensitive information in network data is essential for reducing security risks. This paper presents SenScanner, an artificial intelligence-based model for automatically identifying password-related secrets in mixed text. By combining natural language processing and machine learning techniques, SenScanner detects leaked password-related sensitive information across heterogeneous textual contexts. Experimental results on 2000 public data samples show that SenScanner achieves superior precision, recall, and F1-score, demonstrating its effectiveness in reducing false positives and manual review effort.