I have a set of almost 1600 time series covering 2 years, which I want to group into clusters. Do you think this is possible using k-means? Which method would you advise me to use? Is this possible at all using SPSS?
Solved – Clustering of time series
classification, clustering, data mining, spss, time series
Best Answer
k-means cannot use arbitrary distance functions. It is designed for Euclidean distance: it minimizes the sum of squared deviations of the points from their cluster means.
Euclidean distance, however, does not work well for high-dimensional data such as your time series (unless you have a really low sampling rate, say monthly values, which gives only 24 points over 2 years).
For time series, you will probably want to use a time series distance. There are quite a lot of them, designed specifically for different kinds of time series. You really should look at these.
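To illustrate what a time series distance can look like, here is a minimal sketch of dynamic time warping (DTW), one commonly used example. The answer above does not name a specific distance, so the choice of DTW, the function names, and the toy series are assumptions made for illustration only. The sketch compares two series with the same peak shifted by one step.

```python
import math


def euclidean_distance(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute difference as local cost.

    Illustrative, unoptimized O(len(a) * len(b)) implementation.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy data: the same peak, shifted by one time step.
peak = [0, 0, 1, 2, 1, 0, 0]
shifted_peak = [0, 1, 2, 1, 0, 0, 0]

euclid = euclidean_distance(peak, shifted_peak)
dtw = dtw_distance(peak, shifted_peak)
print("Euclidean:", euclid)
print("DTW:", dtw)
```

Euclidean distance penalizes the shift point by point, while DTW is allowed to stretch the time axis and so treats the two shapes as identical. Whether that invariance is desirable depends on the kind of series you have, which is why the choice of distance matters.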
They won't work with k-means, but there are various distance- and density-based clustering algorithms (where density is usually defined by distance!) that you should try. However, I have no idea what SPSS supports; I don't know whether it has any time series distances, either.
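As a rough sketch of how a density-based algorithm can run on nothing but pairwise distances, the following implements a small DBSCAN-style procedure on a precomputed distance matrix. Everything here is an assumption for illustration: the choice of DBSCAN and of dynamic time warping as the distance, the function names, the toy series, and the `eps` and `min_pts` values. It is not a statement about what SPSS offers.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance (absolute difference as local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def dbscan_precomputed(dist, eps, min_pts):
    """DBSCAN-style clustering on a full pairwise distance matrix.

    Returns one label per item: 0, 1, ... for clusters, -1 for noise.
    A point counts as its own neighbor when checking min_pts.
    """
    n = len(dist)
    labels = [None] * n  # None = not yet visited
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_pts:
            labels[p] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # former noise becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return labels


# Hypothetical toy data: shifted peaks, a nearly flat group, and one outlier.
series = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1, 0],
    [5, 5, 5, 5, 5, 5, 5],
    [5, 5, 5, 6, 5, 5, 5],
    [5, 5, 5, 5, 5, 6, 5],
    [0, 9, 0, 9, 0, 9, 0],
]

# The clustering step only ever sees this matrix, never the raw series.
dist = [[dtw_distance(a, b) for b in series] for a in series]
labels = dbscan_precomputed(dist, eps=1.5, min_pts=2)
print(labels)
```

Because the algorithm only consumes the distance matrix, any other time series distance can be swapped in without touching the clustering code. For 1600 series the full matrix has about 2.5 million entries, so a faster distance implementation than this pure-Python one would likely be needed in practice.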