DOI: 10.3390/mti10070073 ISSN: 2414-4088

Evaluating In-Vehicle Multimodal Interaction via Multimodal Behavioral Signals: A Theory-Driven Tool Chain and Sim-to-Real Pilot Study

Xinyi Li, Gang Guo, Qihang Sun, Yingzhang Wu, Wenbo Li

Multitasking is pervasive in multimodal interaction, particularly within safety-critical domains like driving. Evaluating the impact of In-Vehicle Multimodal Interaction (IVMI) on drivers is critical, yet existing methods predominantly rely on post hoc subjective surveys or coarse unimodal monitoring. Grounded in Multiple Resource Theory and following a Research through Design methodology, we operationalized this theory into a non-intrusive tool chain that evaluates IVMI impact from multimodal behavioral signals (visual, touch, and driving) and supports real-time, objective evaluation in both simulated and real-world domains. To mitigate the Sim-to-Real gap, the method combines real-world multimodal data acquisition with a modality-decoupled cross-domain calibration. Its feasibility was evaluated through a simulator study (n = 27) and a small-scale real-world on-road pilot study (n = 3). The results suggest that the tool chain effectively acquires high-fidelity data to support the previously developed evaluation model (Quadratic Weighted Kappa = 0.916) and achieves a preliminary calibration of cross-domain latent feature spaces. As its reference labels are behaviorally derived and share a common basis with the model inputs, this agreement indicates internal consistency rather than independent construct validation. Crucially, while multimodal interaction behaviors (visual and touch) exhibited relatively high cross-domain consistency, real-world driving behaviors showed systematic magnitude suppression. This finding is tentatively interpreted, as a hypothesis to be tested in future work, through the lens of Risk Homeostasis Theory, and highlights the necessity of monitoring multimodal interaction behaviors rather than relying solely on vehicle telemetry.
Overall, this research develops and provides preliminary feasibility evidence for a theory-driven cross-domain tool chain, indicating its potential to objectively quantify multimodal interaction impacts in real-world multitasking contexts. Given the small, homogeneous on-road sample, these pilot-stage results should be read as feasibility evidence and a methodological basis for future large-scale, demographically diverse validation.
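
For readers unfamiliar with the agreement metric reported above, the following is a minimal sketch of the standard Quadratic Weighted Kappa computation for ordinal labels. It is not the authors' implementation: the function name `quadratic_weighted_kappa`, the integer label encoding (0 to `n_classes - 1`), and the example ratings are assumptions made purely for illustration.

```python
def quadratic_weighted_kappa(reference, predicted, n_classes):
    """Standard QWK between two equal-length lists of ordinal labels.

    Labels are assumed (for this sketch) to be integers in [0, n_classes - 1].
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    if n_classes < 2:
        raise ValueError("need at least two ordinal classes")

    total = len(reference)

    # Observed confusion matrix: rows = reference label, cols = predicted label.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for r, p in zip(reference, predicted):
        observed[r][p] += 1

    # Marginal label counts for each rater.
    ref_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic penalty grows with the squared ordinal distance.
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under chance agreement (independent marginals).
            weighted_expected += weight * ref_hist[i] * pred_hist[j] / total

    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all labels fall in one class")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a 3-level ordinal scale (not data from the study).
reference_labels = [0, 1, 2, 2]
predicted_labels = [0, 1, 2, 1]
kappa = quadratic_weighted_kappa(reference_labels, predicted_labels, n_classes=3)
```

A value of 1 indicates perfect agreement and 0 indicates chance-level agreement; because disagreements are penalized by squared ordinal distance, near-misses cost less than distant misclassifications.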
