DOI: 10.1177/18761364251315239 ISSN: 1876-1364

MMSAD—A multi-modal student attentiveness detection in smart education using facial features and landmarks

Ruchi Singh, Ramanujam E, Naresh Babu M

Virtual education (online education or e-learning) is a form of education in which instruction is delivered primarily through digital platforms and the Internet. This approach offers flexibility and accessibility, making it attractive to many students, and many institutes also offer virtual professional courses for business and working professionals. However, ensuring the reach of courses and evaluating students' attentiveness present significant challenges for educators teaching virtually. Various research works have been proposed to evaluate students' attentiveness using facial landmarks, facial expressions, eye movements, gestures, postures, etc.; however, none of these methods supports real-time analysis and evaluation. This paper introduces a multi-modal student attentiveness detection (MMSAD) model designed to analyze and evaluate real-time class videos using two modalities: facial expressions and facial landmarks. Using a lightweight deep learning architecture, MMSAD recognizes students' emotions from facial expressions and identifies when a student is speaking during an online class by examining lip movements derived from facial landmarks. The emotion-recognition component was evaluated on five benchmark datasets, achieving accuracy rates of 99.05% on extended Cohn-Kanade (CK+), 87.5% on RAF-DB, 78.12% on Facial Emotion Recognition-2013 (FER-2013), 98.50% on JAFFE, and 88.01% on KDEF. The speaking-detection component identifies individuals speaking in real-time class videos. The outputs of the two modalities are combined to predict attentiveness, categorizing each student as either attentive or inattentive.
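The abstract does not give implementation details, but a minimal sketch of the landmark-based speaking detector might look like the following. It assumes MediaPipe Face Mesh landmarks; the landmark indices, window size, and variance threshold below are illustrative assumptions, not the paper's actual parameters.

```python
# Minimal sketch: detect speaking from lip-movement variance,
# assuming MediaPipe Face Mesh landmarks (not the paper's exact setup).
from collections import deque
import cv2
import mediapipe as mp
import numpy as np

UPPER_LIP, LOWER_LIP = 13, 14         # inner-lip landmarks (Face Mesh)
LEFT_CORNER, RIGHT_CORNER = 61, 291   # mouth-corner landmarks (Face Mesh)
WINDOW = 30                           # ~1 s of frames at 30 fps (assumption)
VAR_THRESHOLD = 1e-4                  # assumed variance threshold for speech

def mouth_aspect_ratio(landmarks):
    """Vertical lip opening normalized by mouth width."""
    up, lo = landmarks[UPPER_LIP], landmarks[LOWER_LIP]
    lc, rc = landmarks[LEFT_CORNER], landmarks[RIGHT_CORNER]
    vertical = np.hypot(up.x - lo.x, up.y - lo.y)
    horizontal = np.hypot(lc.x - rc.x, lc.y - rc.y)
    return vertical / (horizontal + 1e-6)

ratios = deque(maxlen=WINDOW)
face_mesh = mp.solutions.face_mesh.FaceMesh(max_num_faces=1,
                                            refine_landmarks=True)
cap = cv2.VideoCapture(0)  # webcam, or a recorded class video file
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if result.multi_face_landmarks:
        ratios.append(
            mouth_aspect_ratio(result.multi_face_landmarks[0].landmark))
        # High short-term variance in lip aperture suggests speech;
        # a static open or closed mouth yields low variance.
        speaking = len(ratios) == WINDOW and np.var(ratios) > VAR_THRESHOLD
        print("speaking" if speaking else "silent")
cap.release()
```

In such a design, using the variance of the lip aperture rather than its raw value distinguishes talking from a mouth simply held open; the paper's actual fusion of this signal with the emotion classifier is not specified in the abstract.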
