DOI: 10.3390/app16136478 ISSN: 2076-3417

A Hybrid Feature-Enhanced IndoBERT Framework with Controlled Semi-Supervised Learning for Low-Resource Indonesian Hate Speech Detection

Shoffan Saifullah, Rafał Dreżewski

Low-resource hate speech detection remains a challenging task for Indonesian social media due to limited labeled annotations, highly informal linguistic expressions, and substantial lexical variability. Under such conditions, purely supervised transformer models often suffer from unstable semantic generalization, while conventional pseudo-labeling methods are vulnerable to propagating noisy pseudo-labels from unlabeled samples. To address these limitations, this study proposes a hybrid feature-enhanced IndoBERT framework integrated with a controlled semi-supervised learning strategy. The proposed model combines contextual IndoBERT embeddings with abusive lexicon cues, handcrafted linguistic indicators, and TF-IDF–SVD statistical representations through a lightweight concatenation–projection feature fusion mechanism. Unlabeled data are incorporated via adaptive confidence thresholding and class-balanced pseudo-label selection to improve pseudo-label reliability. Extensive experiments were conducted under realistic low-resource supervision settings using only 5%, 10%, and 20% labeled data, and the proposed framework was systematically compared against representative baselines, including sparse lexical machine learning models, shallow neural architectures, multilingual transformers, IndoBERTweet, naive pseudo-labeling, and LLM-based prompting. The results show that model effectiveness depends strongly on the amount of supervision. Under the most extreme low-resource setting, compact statistical augmentation provides the most stable complementary signal, whereas under moderate low-resource supervision, the full hybrid representation combined with controlled semi-supervised learning yields the strongest and most consistent gains. The proposed Hybrid IndoBERT + controlled SSL framework outperforms all baselines at the 20% labeled setting, reaching an accuracy of 0.8654, Macro-F1 of 0.8633, and ROC-AUC of 0.9334.
Additional analyses of pseudo-label reliability, calibration behavior, computational efficiency, and qualitative error patterns show that the proposed framework improves low-resource robustness while maintaining comparable inference-time efficiency. Experiments using GPT-4o-mini and Llama-3.1-8B further demonstrate that the proposed framework remains competitive with general-purpose large language model prompting approaches in low-resource Indonesian hate speech detection scenarios. These findings demonstrate that low-resource hate speech detection benefits most from the staged integration of contextual semantic modeling, interpretable linguistic cues, global lexical–statistical structure, and carefully regulated unlabeled data exploitation. The proposed framework provides a practical and reproducible direction for hate speech detection in annotation-constrained social media environments.
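
The abstract names adaptive confidence thresholding and class-balanced pseudo-label selection but does not give the exact rules. The following is a minimal sketch of one plausible reading, not the authors' implementation. The function name select_pseudo_labels, the base_threshold value, and the rule that scales each class threshold by that class's mean confidence are all assumptions made for illustration.

```python
import numpy as np


def select_pseudo_labels(probs, base_threshold=0.9, per_class_cap=None):
    """Pick confident, class-balanced pseudo-labels (illustrative sketch).

    probs: array of shape (n_samples, n_classes) with predicted class
           probabilities for the unlabeled samples.
    Returns (indices, labels) of the selected samples.
    """
    probs = np.asarray(probs, dtype=float)
    conf = probs.max(axis=1)     # confidence of the predicted class
    pred = probs.argmax(axis=1)  # predicted class per sample
    n_classes = probs.shape[1]

    # ASSUMED adaptive rule: scale the base threshold by each class's mean
    # confidence relative to the most confident class, so that a class the
    # model is less sure about gets a lower cutoff.
    class_mean = np.array([
        conf[pred == c].mean() if np.any(pred == c) else 0.0
        for c in range(n_classes)
    ])
    thresholds = base_threshold * class_mean / max(class_mean.max(), 1e-12)

    # Candidates per class: predicted as that class and above its threshold.
    candidates = [
        np.where((pred == c) & (conf >= thresholds[c]))[0]
        for c in range(n_classes)
    ]

    # ASSUMED balancing rule: keep the same number of samples per class,
    # limited by the class with the fewest candidates.
    k = min(len(idx) for idx in candidates)
    if per_class_cap is not None:
        k = min(k, per_class_cap)

    # Within each class, keep the k most confident candidates.
    selected = [idx[np.argsort(-conf[idx])[:k]] for idx in candidates]
    indices = np.concatenate(selected).astype(int)
    return indices, pred[indices]


if __name__ == "__main__":
    demo_probs = [
        [0.95, 0.05],
        [0.92, 0.08],
        [0.60, 0.40],
        [0.10, 0.90],
        [0.45, 0.55],
    ]
    idx, labels = select_pseudo_labels(demo_probs)
    print(idx.tolist(), labels.tolist())
```

In this sketch the balancing step discards confident samples from the majority class rather than letting them dominate the pseudo-labeled set, which is one simple way to limit the noise propagation that naive pseudo-labeling is prone to.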
