DOI: 10.1145/3699736 ISSN: 2474-9567

GOAT: A Generalized Cross-Dataset Activity Recognition Framework with Natural Language Supervision

Shenghuan Miao, Ling Chen

Wearable human activity recognition faces challenges in cross-dataset generalization due to variations in device configurations and activity types across datasets. We present GOAT, a Generalized crOss-dataset Activity recogniTion framework that leverages learning with natural language supervision to address these challenges. GOAT utilizes textual attributes from activity labels and device on-body positions to enable multimodal pre-training, aligning wearable activity representations with corresponding textual representations. This approach enables GOAT to adapt to diverse device configurations and activity label spaces in downstream tasks. Our method incorporates a novel device position encoding technique, a Transformer-based activity encoder, and a cosine similarity loss function to enhance feature extraction and generalization capabilities. Extensive evaluations demonstrate GOAT's effectiveness across various scenarios, including comparisons with state-of-the-art baselines, component analysis, and zero-shot activity recognition. GOAT shows promise for advancing cross-dataset activity recognition, offering a flexible and scalable solution for diverse wearable sensing applications.
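The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`cosine_similarity`, `cosine_alignment_loss`, `zero_shot_predict`) and the toy three-dimensional embeddings are hypothetical, and the loss form (one minus cosine similarity, averaged over matched pairs) is an assumption, since the abstract only states that a cosine similarity loss is used. In the framework itself the embeddings would come from the Transformer-based activity encoder and a text encoder. Here they are hand-written vectors.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_alignment_loss(
    sensor_batch: Sequence[Sequence[float]],
    text_batch: Sequence[Sequence[float]],
) -> float:
    """Assumed loss form: mean of (1 - cosine similarity) over matched pairs.

    sensor_batch[i] and text_batch[i] are taken to describe the same activity.
    """
    if not sensor_batch or len(sensor_batch) != len(text_batch):
        raise ValueError("batches must be non-empty and of equal length")
    total = sum(
        1.0 - cosine_similarity(s, t) for s, t in zip(sensor_batch, text_batch)
    )
    return total / len(sensor_batch)


def zero_shot_predict(
    sensor_embedding: Sequence[float],
    label_embeddings: Dict[str, List[float]],
) -> str:
    """Return the label whose text embedding is closest to the sensor embedding."""
    return max(
        label_embeddings,
        key=lambda label: cosine_similarity(sensor_embedding, label_embeddings[label]),
    )


# Toy text embeddings for three activity labels (hypothetical values).
label_embeddings = {
    "walking": [1.0, 0.0, 0.0],
    "sitting": [0.0, 1.0, 0.0],
    "cycling": [0.0, 0.0, 1.0],
}

# A toy sensor embedding that lies closest to the "walking" text embedding.
predicted = zero_shot_predict([0.9, 0.1, 0.0], label_embeddings)
```

Because prediction only compares a sensor embedding against text embeddings, a label unseen during pre-training can be added by embedding its text, which is how a shared representation space supports the zero-shot setting mentioned above.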
