Solved – How to cluster time series

clustering, spss, time series

I have a question about cluster analysis. I have 3000 companies that need to be clustered according to their power usage over 5 years. Each company has a value for every hour of those 5 years. I would like to find out whether some companies share the same power usage pattern over that period. The results will be used for daily prediction of power usage. If you have any ideas on how to cluster time series in SPSS, please share them.

Best Answer

A) Spend a lot of time on preprocessing the data. Preprocessing is 90% of your job.
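The answer does not say which preprocessing steps to apply, so the following is only a sketch of three steps that are commonly useful for hourly load data: filling missing readings, collapsing the series into an average 24-hour profile, and z-normalizing so that companies of different size become comparable. The function names, the forward-fill strategy, and the assumption that each series starts at hour 0 of a day are all illustrative choices, not part of the original answer.

```python
from statistics import mean, pstdev


def fill_gaps(values):
    """Forward-fill None readings; back-fill any gaps at the very start."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last  # reuse the most recent valid reading (may still be None)
        filled.append(v)
        last = v
    # Leading gaps have no earlier reading, so use the first valid one.
    first_valid = next((v for v in filled if v is not None), 0.0)
    return [first_valid if v is None else v for v in filled]


def daily_profile(hourly, hours_per_day=24):
    """Average by hour of day. Assumes the series starts at hour 0 and
    covers at least one full day."""
    buckets = [[] for _ in range(hours_per_day)]
    for i, v in enumerate(hourly):
        buckets[i % hours_per_day].append(v)
    return [mean(b) for b in buckets]


def z_normalize(values):
    """Rescale to zero mean and unit variance; constant series map to zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy data: two days of hourly readings with some missing values.
raw = [10, None, 12, 14] * 12
profile = daily_profile(fill_gaps(raw))
normalized = z_normalize(profile)
```

Whether to cluster on the full 5-year series or on a reduced representation such as this daily profile depends on which patterns matter for the prediction task.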

B) Choose an appropriate similarity measure for the time series. For example, threshold crossing distance may be a good choice here. You probably don't want dynamic time warping distance, unless the companies are in different time zones. Threshold crossing may be more appropriate for detecting temporal patterns, because it ignores the actual magnitude (which will likely differ greatly from company to company).
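To make the idea concrete, here is a deliberately simplified stand-in for a threshold-based dissimilarity. It is not the published definition of any particular threshold crossing distance: it just marks the hours in which each series is above its own mean and counts how often the two marks disagree. Using the per-series mean as the threshold is an assumption made here so that magnitude drops out, and the function names are hypothetical.

```python
from statistics import mean


def above_threshold(series, threshold=None):
    """Boolean mask of the time steps where the series exceeds its threshold.

    Assumption: the threshold defaults to the series' own mean, so the mask
    captures when usage is high, not how high it is in absolute terms.
    """
    if threshold is None:
        threshold = mean(series)
    return [v > threshold for v in series]


def threshold_dissimilarity(a, b):
    """Fraction of time steps where exactly one series is above its threshold.

    0.0 means the high-usage periods coincide, 1.0 means they never do.
    """
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    mask_a, mask_b = above_threshold(a), above_threshold(b)
    return sum(x != y for x, y in zip(mask_a, mask_b)) / len(a)


def dissimilarity_matrix(series_list):
    """Symmetric matrix of pairwise dissimilarities with a zero diagonal."""
    n = len(series_list)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = threshold_dissimilarity(
                series_list[i], series_list[j]
            )
    return d


# Toy data: two daytime users of very different size, one nighttime user.
day_small = [1, 1, 5, 5, 1, 1]
day_large = [100, 100, 500, 500, 100, 100]
night = [5, 5, 1, 1, 5, 5]

matrix = dissimilarity_matrix([day_small, day_large, night])
```

In this toy example the two daytime users come out identical despite the hundredfold difference in magnitude, while the nighttime user is maximally dissimilar to both. For 3000 companies with hourly data over 5 years, a pure-Python double loop like this would be slow; a vectorized or sampled variant would be needed in practice.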

C) Cluster the resulting dissimilarity matrix using a method that can work with arbitrary distance functions, such as hierarchical clustering or DBSCAN.
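The question asks about SPSS, but the clustering step is easy to illustrate outside it. As a minimal sketch, assuming NumPy and SciPy are available, the following runs hierarchical clustering on a precomputed dissimilarity matrix. The 4x4 matrix is made-up toy data, and the average linkage and the cut height of 0.5 are arbitrary choices for illustration, not recommendations from the answer.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy symmetric dissimilarity matrix for four companies (made-up values):
# companies 0 and 1 are close, companies 2 and 3 are close.
D = np.array([
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.05],
    [0.80, 0.90, 0.05, 0.00],
])

# linkage() expects the condensed (upper-triangle) form of the matrix;
# checks=True verifies symmetry and a zero diagonal.
condensed = squareform(D, checks=True)

# Average linkage works with any dissimilarity, not only Euclidean distance.
Z = linkage(condensed, method="average")

# Cut the dendrogram at height 0.5 to obtain flat cluster labels.
labels = fcluster(Z, t=0.5, criterion="distance")
print(labels)
```

With this matrix the cut separates companies 0 and 1 from companies 2 and 3. DBSCAN can be used in the same way through scikit-learn's `DBSCAN(metric="precomputed")`, which takes the full square matrix instead of the condensed form.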