DOI: 10.3390/bdcc10070201 ISSN: 2504-2289

A Comparative Study of Time Series Clustering Performance with Classification as a Benchmark

Maria Sadowska, Krzysztof Gajowniczek

This paper extends a previous classification study by examining clustering methods on the same synthetic datasets and comparing their behavior with the previously obtained classification results. It investigates the performance of selected time series clustering methods under controlled changes in noise level and class complexity. Six clustering methods representing distance-based, feature-based, and deep learning approaches were evaluated on 82 balanced synthetic datasets. The datasets contained from two to six classes, different levels of additive Gaussian noise, 200 time series per dataset, and 1000 observations per time series. The analysis focused on clustering quality, comparative behavior with classification models, and computational cost in terms of training time and peak memory usage. Clustering quality was assessed mainly using the Adjusted Rand Index and V-measure, while accuracy after Hungarian label matching was used as an auxiliary measure for comparison with classification models. The results show that distance-based methods, and particularly TimeSeriesKMedoids, achieved the most robust and consistent clustering performance across the considered settings. Clustering quality decreased with both the number of classes and the noise level, but the effect of noise was clearly stronger. Feature-based and deep learning-based clustering methods were generally more sensitive to noise, and deep models were also associated with substantially higher computational cost. In terms of memory usage, classical clustering methods remained below 50 MiB, whereas deep learning-based clustering methods required substantially more memory. This study further shows that accuracy computed after Hungarian label matching may provide an overly optimistic view of clustering quality. Accordingly, it is reported only as an auxiliary metric, and the main interpretation of clustering quality is based on structure-sensitive measures such as the Adjusted Rand Index and V-measure. Overall, the findings highlight the importance of robust distance-based approaches and of using structure-sensitive evaluation measures when analyzing time series clustering.
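
The gap between label-matched accuracy and a structure-sensitive measure can be illustrated with a small sketch. The code below is not taken from the paper: the function names and the toy labels are hypothetical, and the optimal one-to-one matching is found by exhaustive search over permutations rather than by the Hungarian algorithm itself. For the two to six classes considered here both approaches return the same optimal assignment. In practice one would more likely use `scipy.optimize.linear_sum_assignment` and `sklearn.metrics.adjusted_rand_score`.

```python
from collections import Counter
from itertools import permutations
from math import comb


def matched_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to class ids.

    Exhaustive search over permutations; it yields the same optimum as the
    Hungarian algorithm but is only practical for a handful of clusters.
    """
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    overlap = Counter(zip(y_pred, y_true))  # (cluster, class) -> count
    # Pad with None so surplus clusters can be left unmatched.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best = 0
    for assignment in permutations(targets, len(clusters)):
        hits = sum(overlap[(c, k)] for c, k in zip(clusters, assignment))
        best = max(best, hits)
    return best / len(y_true)


def adjusted_rand_index(y_true, y_pred):
    """Adjusted Rand Index computed from the contingency table."""
    n = len(y_true)
    sum_cells = sum(comb(v, 2) for v in Counter(zip(y_true, y_pred)).values())
    sum_rows = sum(comb(v, 2) for v in Counter(y_true).values())
    sum_cols = sum(comb(v, 2) for v in Counter(y_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example (made-up labels): three balanced classes, a poor clustering.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [1, 1, 1, 0, 0, 0, 2, 2, 2, 2, 1, 0]

acc = matched_accuracy(y_true, y_pred)     # 7/12, about 0.58
ari = adjusted_rand_index(y_true, y_pred)  # 1/12, about 0.08
print(f"matched accuracy = {acc:.3f}, ARI = {ari:.3f}")
```

On these toy labels the matched accuracy is about 0.58 while the Adjusted Rand Index is only about 0.08. The matching step always picks the most favorable relabeling, so even a partition close to chance can score well above the 1/3 expected from a fixed labeling, whereas the Adjusted Rand Index corrects for chance agreement.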
