fMRI-Based Prediction of Eye Gaze During Naturalistic Movie Viewing Reveals Eye-Movement–Related Brain Activity
Le Gao, Zhi Wei, Bharat B Biswal, Xin Di
Abstract
Background
Eye gaze provides crucial insights into perceptual and cognitive processes during naturalistic movie viewing, yet concurrent eye tracking is often unavailable in functional MRI (fMRI) research. While deep learning models can estimate gaze directly from fMRI eyeball signals, their out-of-the-box generalizability across heterogeneous datasets requires empirical evaluation.
Methods
We applied a specific pre-trained model from the DeepMReye framework in a zero-shot setting (without dataset-specific fine-tuning) to estimate gaze during movie watching across three independent fMRI datasets. Model accuracy was evaluated against camera-based eye-tracking data and via inter-subject correlations. Furthermore, we derived eye-movement-related time series from the predicted gaze signals to map their associated brain activation.
Results
At the individual level, predicted gaze showed modest correspondence with measured ground-truth data (r ≈ 0.24–0.37), yielding brain activation maps largely restricted to the visual cortex. In contrast, group-averaged gaze predictions corresponded much more closely with the measured data (r ≈ 0.73–0.84). First-level general linear models (GLMs) with regressors derived from group-averaged predictions revealed widespread activation across established oculomotor control regions, including the frontal and parietal eye fields. Exploratory analyses of age-related effects on gaze prediction and brain activity yielded inconsistent results across datasets.
Conclusions
Under a zero-shot implementation, the pre-trained model exhibits limitations for individual-level inference, likely reflecting the absence of dataset-specific training. However, group-averaged fMRI-based gaze estimates successfully capture shared viewing behaviors and robustly support the investigation of eye-movement-related brain activity. These findings inform the appropriate use of fMRI-based gaze decoding for naturalistic neuroimaging datasets lacking ground-truth eye-tracking logs.
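As an illustration of why group averaging can raise the correspondence between predicted and measured gaze, the following minimal sketch uses purely synthetic data. It is not the authors' pipeline and does not use DeepMReye. The function names (`pearson`, `group_mean`, `simulate`), the number of subjects and timepoints, and the noise levels are all assumptions chosen for illustration. It assumes each subject's predicted gaze equals the measured gaze plus independent noise, so that averaging across subjects suppresses the noise while preserving the shared, stimulus-driven component.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def group_mean(series_list):
    """Timepoint-wise average across subjects."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]


def simulate(n_subjects=20, n_timepoints=300, noise_sd=2.0, seed=0):
    """Synthetic gaze traces (hypothetical values, not data from the study).

    measured  = shared stimulus-driven gaze + small idiosyncratic component
    predicted = measured + independent decoding noise (assumption)
    """
    rng = random.Random(seed)
    shared = [rng.gauss(0, 1) for _ in range(n_timepoints)]
    measured = [[s + rng.gauss(0, 0.5) for s in shared]
                for _ in range(n_subjects)]
    predicted = [[m + rng.gauss(0, noise_sd) for m in subj]
                 for subj in measured]
    return measured, predicted


measured, predicted = simulate()

# Individual level: correlate each subject's prediction with their own
# measured gaze, then average the correlations across subjects.
individual_r = [pearson(p, m) for p, m in zip(predicted, measured)]
mean_individual_r = sum(individual_r) / len(individual_r)

# Group level: average across subjects first, then correlate the averages.
group_r = pearson(group_mean(predicted), group_mean(measured))

print(f"mean individual r: {mean_individual_r:.2f}")
print(f"group-averaged r:  {group_r:.2f}")
```

Under these assumptions the group-averaged correlation exceeds the mean individual correlation, because independent noise shrinks roughly with the number of subjects averaged. The specific values depend entirely on the assumed noise levels and should not be compared with the correlations reported above.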