Solved – Time series and anomaly detection

clustering, time series, trend

I would like to set up an algorithm for detecting anomalies in a time series, and I plan to use clustering for that.

  • Why should I use a distance matrix for clustering and not the raw time series data?

  • To detect the anomaly, I plan to use density-based clustering with an algorithm such as DBSCAN. Would that work in this case? Is there an online version for streaming data?

  • I would like to detect the anomaly before it happens, so would using a trend detection algorithm (ARIMA) be a good choice?

Best Answer

Regarding your first question, I recommend reading the well-known paper "Clustering of Time Series Subsequences is Meaningless" before clustering a time series. It is clearly written and illustrates many pitfalls that you will want to avoid.
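To make the density-based idea from the question concrete, here is a minimal sketch, not a definitive implementation. It cuts the series into non-overlapping windows (my assumption, chosen because the paper's criticism is aimed at overlapping sliding-window subsequences), builds a pairwise Euclidean distance matrix, and flags the windows that DBSCAN's definition would label as noise. The function names, the toy series, and the values of `width`, `eps`, and `min_samples` are all hypothetical and only serve the illustration. It reproduces only the noise test, not the full cluster assignment.

```python
import math


def split_windows(series, width):
    """Cut the series into non-overlapping windows of equal width (any tail is dropped)."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, width)]


def distance_matrix(windows):
    """Pairwise Euclidean distances between windows."""
    n = len(windows)
    return [[math.dist(windows[i], windows[j]) for j in range(n)] for i in range(n)]


def dbscan_noise(dist, eps, min_samples):
    """Return the indices that DBSCAN's definition would label as noise.

    A point is a core point if at least min_samples points (itself included)
    lie within eps of it. A point is noise if it is not a core point and has
    no core point within eps.
    """
    n = len(dist)
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    core = [len(nb) >= min_samples for nb in neighbors]
    return [
        i for i in range(n)
        if not core[i] and not any(core[j] for j in neighbors[i])
    ]


# Toy data (assumption): a period-4 pattern repeated six times, with one spike.
series = [0.0, 1.0, 0.0, -1.0] * 6
series[13] = 5.0  # the spike falls inside window 3

windows = split_windows(series, width=4)
dist = distance_matrix(windows)
anomalies = dbscan_noise(dist, eps=0.5, min_samples=3)
print(anomalies)  # [3]
```

On real data the result depends heavily on the window width, the distance measure, and `eps`, so treat the flagged windows as candidates to inspect rather than confirmed anomalies.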