DOI: 10.1063/5.0304928 ISSN: 2688-4089

vioHAR: A visual-inertial odometry-based human activity recognition system

Kazi Md. Shahiduzzaman, Md Salah Uddin Yusuf, Md Sajjad Hossen, Md Nazmul Alam Munna

Human activity recognition (HAR) is an active research area with applications in healthcare, elderly care, and assistive technology. Traditional systems that rely solely on inertial measurement units (IMUs) are limited in precision and are sensitive to changes in device orientation. To address these limitations, this study introduces visual-inertial odometry-based human activity recognition (vioHAR), a methodology and public dataset that fuses visual and inertial data. Using the VINS-Mono state estimator (a visual-inertial odometry system for monocular cameras), the proposed system generates precise, orientation-independent trajectories that support classification across eight distinct daily behaviors and fall scenarios. Under leave-one-subject-out (LOSO) cross-validation, vioHAR achieves an average accuracy of 73.96%, compared with 43.35% for the standard IMU-only baseline, with accuracy gains of +27.60% for the random forest (RF) model and +41.11% for the long short-term memory (LSTM) model. In terms of computational efficiency, the LSTM network trained more than nine times faster than the RF model (44.79 s vs 415.81 s) while maintaining a false alarm rate of 0.038, which suggests its potential for edge computing applications. Despite hardware challenges such as asynchronous sensor clocks, vioHAR provides a foundation for real-world HAR deployment on emerging head-worn devices.
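
For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of how leave-one-subject-out splits and a false alarm rate could be computed. It is not the authors' code. The function names, the `"fall"` label, and the definition of false alarm rate as FP / (FP + TN) are assumptions made for illustration; the abstract does not state how the reported 0.038 was computed.

```python
from collections import defaultdict


def loso_splits(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject.

    subject_ids[i] is the subject who produced sample i (hypothetical layout).
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Train on every sample that does not belong to the held-out subject.
        train_idx = sorted(
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        )
        yield held_out, train_idx, test_idx


def false_alarm_rate(y_true, y_pred, positive="fall"):
    """Assumed definition: FP / (FP + TN) with `positive` as the alarm class."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0
```

Because each fold tests on a subject the model never saw during training, the averaged accuracy reflects generalization to new users rather than to new samples from known users.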
