DOI: 10.47000/tjmcs.1836055 ISSN: 2148-1830

Hybrid Model for DGA Detection in Cybersecurity: Fast Text, CNN, BiLSTM, and Multihead Attention

Abdhine Ben Ali, Sinan Toklu, Fahadi Mugigayi, Djalabi Mahamat
Domain Generation Algorithms (DGAs) are a critical threat in modern cybersecurity: they enable malware to dynamically generate pseudorandom domain names for command-and-control (C&C) communication and thereby evade blacklist-based detection. Because botnets employ many concealment techniques, DGAs continue to pose detection challenges that require more robust approaches. This work presents a hybrid deep learning model that integrates FastText embeddings, Convolutional Neural Networks (CNNs), Bidirectional Long Short-Term Memory (BiLSTM) networks, and Multihead Attention mechanisms for DGA detection. Our methodology uses FastText's subword-level representations to capture morphological patterns, CNNs for local feature extraction, BiLSTM for bidirectional context and sequential dependency modeling, and attention mechanisms to identify the most salient sequence dependencies. We evaluate the hybrid model on a large-scale dataset comprising 500,000 samples from the University of Murcia Domain Generation Algorithm (UMUDGA) dataset and 500,000 samples from 360 Netlab, combined with legitimate domain names from the Alexa top 1 million list. On this dataset the model achieves 99.49% accuracy, 99.75% precision, 99.23% recall, 99.49% F1-score, and 99.97% ROC-AUC. Comparative analysis with baseline methods demonstrates competitive and improved performance under the evaluated settings, indicating strong generalization of the proposed hybrid architecture in efficiently detecting both known and previously unseen DGA families. Our model addresses limitations of traditional machine learning and deep learning approaches and provides a scalable solution potentially suitable for real-time DGA detection in cybersecurity applications.
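To illustrate what "subword-level representations" means for a domain name, the following minimal sketch enumerates fastText-style character n-grams for a single domain label. It is not the authors' implementation: the function name `char_ngrams` is hypothetical, and the n-gram range of 3 to 6 is the fastText library default, assumed here because the abstract does not state the values used. The hashing of n-grams into embedding buckets that fastText performs afterwards is omitted.

```python
def char_ngrams(label, min_n=3, max_n=6):
    """Return fastText-style character n-grams for one domain label.

    Hypothetical helper for illustration only; min_n and max_n are
    assumed to be the fastText defaults, not values from the paper.
    """
    # fastText wraps each token in '<' and '>' so that prefixes and
    # suffixes yield n-grams distinct from those inside the token.
    wrapped = "<" + label + ">"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped label.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


if __name__ == "__main__":
    # Labels below are made-up examples: one dictionary-like,
    # one random-looking as a DGA might produce.
    for label in ("google", "xkqzvbtw"):
        print(label, char_ngrams(label, 3, 3))
```

Because the embedding of a label is built from the vectors of these n-grams, a label never seen during training still receives a representation from the n-grams it shares with known labels, which is one plausible reason subword embeddings suit algorithmically generated names.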
