VideoAlign: A Toolkit to Make Video Analysis Accessible EICS013
João Marcelo Evangelista Belo, Keyne Oei, Anna Maria Feit
Despite the potential of self-supervised video alignment algorithms for advancing human-computer interaction, they remain largely inaccessible to practitioners without machine learning expertise. To bridge this gap, we introduce VideoAlign, an open-source toolkit designed to facilitate the training and integration of video alignment approaches in interactive applications. Through an interactive system, VideoAlign guides users in training models to align videos, detecting and tagging specific events, and performing anomaly detection, without requiring machine learning experience. In addition to implementing state-of-the-art alignment techniques, our toolkit introduces a novel Local-Alignment Contrastive (LAC) loss. Unlike global methods that compare entire video sequences, LAC aligns localized segments independently. This enables robust matching when video structure or timing varies, which is crucial for real-world interactive applications. We demonstrate how VideoAlign facilitates the creation of a wide variety of interactive applications through three scenarios: a teacher-support tool for providing efficient video feedback, a mixed reality application that tracks activity progress in real time, and an anomaly detection tool for monitoring cooking activities.