Empathy-Driven Arabic Conversational Chatbot Using a Pre-Trained Transformer Model
Sarah Masoud Alyami, Nasser A. Alsadhan, Mohamed Maher Ben Ismail

Recent advancements in sequence generation models have transformed the development of conversational chatbots, enabling more dynamic and emotionally aware interactions. While English-language chatbots have achieved notable progress through large language models (LLMs), Arabic-language systems continue to face significant challenges, particularly in handling dialectal variation and morphological complexity and in generating emotionally aligned responses. This paper introduces two approaches to enhance empathetic response generation in Arabic conversational AI. The first, Emotion-Driven Response Generation (EDRG), employs a two-stage pipeline: it first classifies the user's emotion using marBERT and then routes the input to the most suitable Arabic language model (AraBERT, AraELECTRA, AraGPT-2, or MT5) for contextually appropriate response generation. The second, EmoLlama, is a Retrieval-Augmented Generation (RAG)-based framework that integrates a curated knowledge base with the LLaMA model, retrieving relevant conversational contexts before generating semantically rich and empathetic responses. To support these approaches, a large-scale open-domain Arabic dataset was curated, containing over 600,000 dialogue entries spanning empathetic and neutral responses across seven Ekman-based emotion categories. Both models were evaluated using BLEU, Perplexity (PPL), and Cosine Similarity. EDRG achieved strong BLEU scores across multiple emotions, reflecting high lexical alignment, and attained a Cosine Similarity of 0.51. EmoLlama, in contrast, scored substantially higher on semantic similarity, achieving a Cosine Similarity of 0.91, which indicates a stronger ability to generate contextually and semantically rich responses. These results highlight the complementarity of lexical and semantic metrics in evaluating emotionally intelligent Arabic dialogue systems.
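The two-stage EDRG idea (classify the emotion, then route to a generator chosen for that emotion) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the keyword-based `classify_emotion` stands in for the fine-tuned classifier, the emotion labels and the `generator_a`/`generator_b` names are hypothetical, and the stub generators only echo the input rather than calling any real model. It is not the authors' implementation.

```python
from typing import Callable, Dict, Tuple

# A generator maps a user utterance to a reply string.
Generator = Callable[[str], str]


def classify_emotion(text: str) -> str:
    """Stage 1 stand-in: hypothetical keyword rules instead of a trained classifier."""
    lowered = text.lower()
    if "sad" in lowered:
        return "sadness"
    if "happy" in lowered:
        return "joy"
    return "neutral"


def make_stub_generator(name: str) -> Generator:
    """Build a placeholder generator that tags its reply with its own name."""
    def generate(text: str) -> str:
        return f"[{name}] reply to: {text}"
    return generate


# Stage 2 routing table: emotion label -> generator (hypothetical assignment).
ROUTES: Dict[str, Generator] = {
    "sadness": make_stub_generator("generator_a"),
    "joy": make_stub_generator("generator_b"),
}
DEFAULT_GENERATOR = make_stub_generator("default_generator")


def respond(text: str) -> Tuple[str, str]:
    """Classify the input, pick the routed generator, and return (emotion, reply)."""
    emotion = classify_emotion(text)
    generator = ROUTES.get(emotion, DEFAULT_GENERATOR)
    return emotion, generator(text)


if __name__ == "__main__":
    print(respond("I feel sad today"))
    print(respond("hello"))
```

In a real system the routing table would presumably be filled from validation results, assigning each emotion to whichever generator performs best on it; the abstract does not say how that assignment is made.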