Controlled Evaluation of Hybrid Multi-Face Recognition Pipelines for Real-Time Occluded Face Recognition on Edge Devices
Shkëmb Abdullahu, Arbana Kadriu, Marco Piangerelli

Accurate recognition of partially occluded faces remains challenging in unconstrained, real-time environments, especially under masks and other partial occlusions, pose variation, and illumination changes. This study presents a controlled comparison of three hybrid multi-face recognition pipelines for robust occluded face recognition. For a fair evaluation, all pipelines use the same SCRFD face detector, preprocessing protocol, Linear SVM classifier, and 60% unknown-rejection threshold, and vary only the feature extractor: ResNet29, ConvNeXt, or ResNet100 with ArcFace embeddings. To reduce data leakage, models are trained only on normal, non-occluded faces and tested on unseen, partially occluded faces. Evaluation is performed on a custom dataset and on the public Real-World Occluded Faces (ROF) dataset, alongside three previously published methods with publicly available code, tested under the same experimental protocol. The pipeline combining SCRFD, ArcFace ResNet100, and Linear SVM achieved the best results among the published methods and our other pipelines, reaching 97.475% real-time accuracy for five faces and over 99% confusion-matrix-based accuracy on the custom dataset. On the ROF dataset, it achieved closed-set accuracies of 98.66% for sunglasses and 97.92% for masks, with threshold-based accuracies of 96.35% on the sunglasses test and 95.14% on the mask test. It also obtained EER values below 0.007 and AUC values above 99%. In real-time testing, it achieved 29.25 FPS with 34.18 ms/frame latency on a GPU-enabled laptop and approximately 5 FPS with 273.4 ms/frame latency on a Raspberry Pi 4.
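
The difference between the closed-set and threshold-based accuracies reported above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the classifier stage yields one confidence in [0, 1] per enrolled identity (the abstract does not say how the Linear SVM scores are turned into confidences), that a face is labeled "unknown" when its top confidence falls below the 60% threshold, and that a rejected known face counts as an error in the threshold-based metric. The function names and the `UNKNOWN_LABEL` constant are hypothetical.

```python
from typing import List, Sequence, Tuple

UNKNOWN_THRESHOLD = 0.60   # the 60% rejection threshold stated in the abstract
UNKNOWN_LABEL = "unknown"  # hypothetical label for rejected faces

Sample = Tuple[Sequence[float], str]  # (per-identity confidences, true identity)


def predict_with_rejection(
    confidences: Sequence[float],
    identities: Sequence[str],
    threshold: float = UNKNOWN_THRESHOLD,
) -> str:
    """Return the top-scoring identity, or UNKNOWN_LABEL if its confidence
    is below the threshold (assumption: ties with the threshold are accepted)."""
    if not identities or len(confidences) != len(identities):
        raise ValueError("confidences and identities must be non-empty and aligned")
    best = max(range(len(confidences)), key=lambda i: confidences[i])
    if confidences[best] >= threshold:
        return identities[best]
    return UNKNOWN_LABEL


def accuracy(samples: List[Sample], identities: Sequence[str], threshold: float) -> float:
    """Fraction of samples whose prediction equals the true identity."""
    if not samples:
        raise ValueError("samples must not be empty")
    correct = sum(
        predict_with_rejection(conf, identities, threshold) == truth
        for conf, truth in samples
    )
    return correct / len(samples)


def closed_set_accuracy(samples: List[Sample], identities: Sequence[str]) -> float:
    """Plain arg-max accuracy: a threshold of 0.0 never rejects
    (confidences are assumed to be non-negative)."""
    return accuracy(samples, identities, threshold=0.0)


def threshold_accuracy(samples: List[Sample], identities: Sequence[str]) -> float:
    """Accuracy when low-confidence predictions are rejected as unknown."""
    return accuracy(samples, identities, threshold=UNKNOWN_THRESHOLD)


if __name__ == "__main__":
    ids = ["alice", "bob", "carol"]  # illustrative enrolled identities
    data: List[Sample] = [
        ([0.90, 0.05, 0.05], "alice"),  # confident and correct
        ([0.50, 0.30, 0.20], "alice"),  # correct arg-max, but below 0.60
        ([0.20, 0.70, 0.10], "alice"),  # confident but wrong
    ]
    print("closed-set:", closed_set_accuracy(data, ids))
    print("threshold-based:", threshold_accuracy(data, ids))
```

Under these assumptions the threshold-based accuracy can never exceed the closed-set accuracy on known subjects, because rejection can only turn a correct low-confidence prediction into an error. This matches the direction of the gap in the ROF results.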