Named Entity Recognition in Arabic Mental Health Using Large Language Models
Abeer Alfaifi, Hassan Alhuzali, Ashwag Alasmari

Named Entity Recognition (NER) in Arabic mental-health text is constrained by the scarcity of gold annotations and by dialectal variation. Identifying medical entities (e.g., Symptom, Diagnosis, Drug, Dosage) in low-resource contexts is crucial for clinical triage and mental health safety. To address this gap, we propose and evaluate an end-to-end framework for NER in Arabic patient-generated mental health questions. Our framework leverages state-of-the-art Large Language Models (LLMs), including GPT-4o, LLaMA, and ALLaM, using a Persona–Template prompting strategy and comparing zero-shot and few-shot settings across the models. We apply a clinically grounded nine-entity schema to extract structured information from mental health questions. We adopt a hybrid evaluation strategy that combines LLM-as-a-Judge scoring for scalability with targeted validation on a manually curated drug subset, compensating for the limited availability of fully gold-annotated data. Results show that few-shot prompting improves reliability across models. GPT-4o achieves the highest F1 on the drug test set (0.86→0.94), LLaMA remains stable and competitive (0.89→0.88), and the Arabic-focused ALLaM shows the largest relative gain (0.58→0.73). This work not only advances Arabic NLP for mental health applications but also offers a reproducible framework for low-resource NER, demonstrating the feasibility of LLM-driven medical entity extraction in critical clinical contexts.
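To make the reported F1 figures concrete, the following is a minimal sketch of how an entity-level F1 score could be computed on a gold-annotated subset. It assumes exact matching on (entity type, surface text) pairs; the matching criterion actually used in this work is not stated here, and the function name `entity_f1` and the example entities are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical representation: an entity is a (type, surface_text) pair.
Entity = Tuple[str, str]


def entity_f1(gold: Iterable[Entity], predicted: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring.

    Assumption: a prediction counts as correct only if both its type
    and its surface text exactly match a gold entity.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only, not taken from the evaluation set.
gold = [("Drug", "sertraline"), ("Drug", "fluoxetine")]
predicted = [("Drug", "sertraline"), ("Drug", "paracetamol")]
precision, recall, f1 = entity_f1(gold, predicted)
```

Exact matching is the strictest choice; a relaxed variant (for example, normalizing Arabic orthographic variants before comparison) would generally yield higher scores and should be reported separately.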