DOI: 10.1145/3822181 ISSN: 1551-6857

Computational Attention-based Modeling of Individually Perceived Quality of Compressed Images

Lohic Fotio Tiotsop, Max Geissler, Marcus Barkowsky

Achieving personalized Quality of Experience (QoE) prediction and optimization is a long-term goal in media quality assessment, and recent research has shifted from predicting Mean Opinion Scores (MOS) toward modeling and predicting the quality judgments of individual subjects. In this work, we propose a model of individual image quality assessment that explicitly accounts for two key components of human visual perception: the spatial allocation of attention and the extraction of perceptual information from different regions of the visual scene. The model describes individual quality perception as a multi-stage process in which perceptual information and computational attention are iteratively combined and updated, followed by a probabilistic decision stage. From a theoretical perspective, the proposed formulation is defined at an abstract level and does not rely on any specific neural network architecture, and it provides a framework for explaining how computational attention and perceptual information interact to produce individual quality judgments. To illustrate the practical applicability of the proposed model, we instantiate it using a Vision Transformer (ViT) architecture as one possible implementation, training one Artificial Intelligence-based Observer (AIO) per subject with observer-specific learned parameters. Experimental results show that the resulting ViT-based AIOs: i) compare favorably with state-of-the-art deep CNN-based individual models; ii) outperform state-of-the-art no-reference image quality assessment metrics when used to predict individual opinion scores; iii) tend to allocate greater attention to regions with severe quality degradation.
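As a rough illustration of the multi-stage process described above, the following minimal sketch alternates between allocating attention over image regions and updating an internal state from the attended perceptual information, then ends with a probabilistic decision over opinion score categories. It is not the authors' ViT-based implementation: the function names, the dot-product attention, the 0.5/0.5 update rule, and all parameters (`query`, `score_weights`, `n_stages`) are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simulate_observer(regions, query, score_weights, n_stages=3):
    """Toy observer (hypothetical, for illustration only).

    regions:       one feature vector per image region
    query:         observer-specific vector (assumed to be learned per subject)
    score_weights: one weight vector per opinion score category
    Returns (attention over regions, probabilities over score categories).
    """
    state = list(query)
    attention = [1.0 / len(regions)] * len(regions)
    for _ in range(n_stages):
        # Attention stage: regions that match the current state get more weight.
        attention = softmax([dot(state, r) for r in regions])
        # Perception stage: attention-weighted summary of the region features.
        summary = [
            sum(a * r[k] for a, r in zip(attention, regions))
            for k in range(len(state))
        ]
        # Update stage: blend the previous state with the new information.
        state = [0.5 * s + 0.5 * p for s, p in zip(state, summary)]
    # Decision stage: a probability distribution over opinion scores.
    probs = softmax([dot(w, state) for w in score_weights])
    return attention, probs


# Three regions with 2-dimensional features; the last one is assumed to be
# the most severely degraded region.
regions = [[0.1, 0.0], [0.2, 0.1], [2.0, 1.5]]
query = [1.0, 1.0]
# Five score categories, e.g. a 1-to-5 rating scale (assumed).
score_weights = [[1.0, 1.0], [0.5, 0.5], [0.0, 0.0], [-0.5, -0.5], [-1.0, -1.0]]

attention, probs = simulate_observer(regions, query, score_weights)
print("attention:", [round(a, 3) for a in attention])
print("score probabilities:", [round(p, 3) for p in probs])
```

In this toy setting, using a different `query` and `score_weights` per subject plays the role of the observer-specific parameters mentioned in the abstract; in the paper these are learned from each subject's ratings rather than set by hand.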
