Solved – Anomaly detection in multivariate time series data

anomaly detectiondbscantime series

I am trying to solve an anomaly detection problem that consists of three variables captured over a span of five years. It is an unsupervised problem, and I believe density-based clustering methods like DBSCAN aren't a good fit for this problem as it doesn't consider seasonality, time series nature of the variables. Can someone suggest appropriate methods or any implementation that is similar to this problem?

Best Answer

An other way to perform anomaly detection in an unsupervised manner, taking into account seasonality, is by mean of neural networks such as LSTM or auto encoders (see this recent paper).

Basically, the idea is to measure abnormality by mean of reconstruction error. Your network will be trained on a sequence of (supposedly) legitimate events, and for a new datapoint, will try to reconstruct it. Because the function your network has been learning is tailored for the normal datapoints, a reconstruction error (could) denote an anomaly.

ANN such as LSTM are designed to not update their internal state for every packet going through the network, thus retaining the impact of older data points for a longer period of time.