Solved – Trend and Breakout detection in time series

anomaly detection, change point, r, time series, trend

I am working with several types of metrics that characterize the components of an application. They range from system metrics like cpu.utilization to network and database metrics like bytes.out/bytes.in, and response-time for apache and haproxy.

The assumption of a normal distribution doesn't seem to hold for these metrics because they depend on the load, and the distribution is skewed for almost constant load. Seasonality might also be present in a few of the metrics.

The objective is to find out, in real time, whether there is a long-term change in trend or whether there was a breakout in the time series of these metrics at a given instant.

What are the best approaches for building a generic breakout detection system, or do we need different approaches depending on the nature of these metrics?

For breakout detection I am thinking of using a t-test to check whether the mean of the current window differs significantly from the mean of a previous window or from the long-term mean.
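
A minimal sketch of this windowed comparison, using base R only. The function name `window_shift_test`, the window width of 50, and the synthetic skewed series are assumptions made for illustration; they do not come from the metrics described above. Because normality is doubtful here, the sketch reports a rank-based Wilcoxon test next to the Welch t-test.

```r
# Compare the most recent window with the window just before it.
# `window_shift_test` and `width` are hypothetical names for this sketch.
window_shift_test <- function(x, width = 50) {
  n <- length(x)
  stopifnot(n >= 2 * width)
  current  <- x[(n - width + 1):n]
  previous <- x[(n - 2 * width + 1):(n - width)]
  list(
    welch_p    = t.test(current, previous)$p.value,      # Welch t-test (unequal variances)
    wilcoxon_p = wilcox.test(current, previous)$p.value  # rank-based, no normality assumption
  )
}

set.seed(1)
# Synthetic skewed series: the last 50 points are shifted upward by 3
x <- c(rexp(150, rate = 1), rexp(50, rate = 1) + 3)
res_shift <- window_shift_test(x, width = 50)
res_shift
```

Both tests assume independent observations. Consecutive samples of a system metric are usually autocorrelated, so the p-values from such a sketch are likely to be optimistic and would need a stricter threshold in practice.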

Any guidance on the approaches will be very helpful.

Update: Adding links to a few data sets.

mysql.bytes_received

haproxy.requests

Best Answer

There are several solutions to your problem. There are two forms of outliers:

  1. Additive outliers (also called pulses)
  2. Level shifts (also called breaks in trend)
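
To make the distinction concrete, here is a small simulation in base R. The series, the shift size of 8, and the helper `median_change` are made up for illustration and are not taken from the data in the question.

```r
set.seed(42)
n <- 200
base <- rnorm(n, mean = 10, sd = 1)

# Additive outlier (pulse): one observation jumps, then the series returns
pulse <- base
pulse[100] <- pulse[100] + 8

# Level shift: every observation from index 100 onward moves to a new mean
shift <- base
shift[100:n] <- shift[100:n] + 8

# Median after index `at` minus median before it (hypothetical helper)
median_change <- function(x, at = 100) {
  median(x[at:length(x)]) - median(x[1:(at - 1)])
}

median_change(pulse)  # close to 0: a single spike barely moves the median
median_change(shift)  # close to 8: the level itself has changed
```

A pulse affects one point and leaves the level on either side unchanged, while a level shift changes the level of everything that follows it.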

I'm assuming you need type 2, which is what you call breakout detection. There are a variety of methods and tools that could help you with this:

Open Source Software:

  1. changepoint package in R
  2. Break point package
  3. Robust Anomaly Detection (RAD) from Netflix
  4. Breakout detection by Twitter

There are also two commercial products that I have used with great success:

  1. SAS, using the UCM and ARIMA frameworks
  2. SPSS time series outlier detection

It is beyond the scope of one answer to cover the pros and cons of these methodologies. I must say that RAD from Netflix and Breakout detection from Twitter perform poorly on your data. In my opinion, this tells you that statisticians have developed elegant methods, like the one in the changepoint package, that easily detect breakpoints in your data. I have also had excellent success using SAS/SPSS.

Below are some of the results from applying all four open source packages. Twitter's breakout is the worst: it does not recognize any breaks in your data. Netflix's RAD does point out all your additive outliers/pulses but fails to recognize the level shift around data point ~1351. Both changepoint and breakpoint correctly detect the level shift, at observations 1353 and 1351 respectively. I'll expand my answer in the future. Let us know if this is what you are looking for.

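library("strucchange")       # provides breakpoints(), used below
library("changepoint")       # provides cpt.mean()
library("RAD")               # Netflix Robust Anomaly Detection
library("BreakoutDetection") # Twitter breakout(), used below
library("ggplot2")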

## Get Data

tsdata <- read.csv("mysql.bytes_received.csv", header = TRUE, sep = ",")
tsdata.value <- tsdata[,2]


## Use Breakpoint

bp.tsdata <- breakpoints(tsdata.value ~ 1)
bp.tsdata
breakpoints(bp.tsdata)


## Use Changepoint

ansmean <- cpt.mean(tsdata.value)
ansmean
plot(ansmean, cpt.col = 'blue')


## USE RAD from Netflix

rad <- AnomalyDetection.rpca(as.numeric(tsdata.value), frequency = 1)
ggplot_AnomalyDetection.rpca(rad) + ggplot2::theme_grey(base_size = 25)


## USE Breakout from Twitter
res = breakout(tsdata.value, method='multi', plot=TRUE)
res
res$plot

Output from changepoint and breakpoint:

> ansmean
Class 'cpt' : Changepoint Object
       ~~   : S4 class containing 12 slots with names
              date version data.set cpttype method test.stat pen.type pen.value minseglen cpts ncpts.max param.est 

Created on  : Wed Mar 30 01:39:30 2016 

summary(.)  :
----------
Created Using changepoint version 2.2.1 
Changepoint type      : Change in mean 
Method of analysis    : AMOC 
Test Statistic  : Normal 
Type of penalty       : MBIC with value, 25.04703 
Minimum Segment Length : 1 
Maximum no. of cpts   : 1 
Changepoint Locations : 1353 
> bp.tsdata

     Optimal 3-segment partition: 

Call:
breakpoints.formula(formula = tsdata.value ~ 1)

Breakpoints at observation number:
1351 2769 

Corresponding to breakdates:
0.3196876 0.6552295 
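
Note that the changepoint summary above reports the method as AMOC (at most one change), so it can return only one location, while breakpoints() reports two. A minimal sketch, assuming the changepoint package is installed, of how a multiple change point search with `method = "PELT"` compares with AMOC. The synthetic series below is invented for illustration and the results will not match the mysql.bytes_received data.

```r
library("changepoint")

set.seed(7)
# Synthetic series with two level shifts, after observations 100 and 200
y <- c(rnorm(100, mean = 0), rnorm(100, mean = 5), rnorm(100, mean = 10))

# AMOC: reports at most one change point
fit_amoc <- cpt.mean(y, method = "AMOC")

# PELT: searches over any number of change points
fit_pelt <- cpt.mean(y, method = "PELT", penalty = "MBIC")

cpts(fit_amoc)  # at most one location
cpts(fit_pelt)  # locations of all detected changes in mean
```

Whether PELT agrees with breakpoints() on real data depends on the penalty and on how well a piecewise-constant mean describes the series, so the two outputs are worth comparing side by side.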

Output from RAD (Netflix) and Breakout detection (Twitter), both of which fail to recognize breakouts:

Twitter's Breakout detection:

> res
$loc
integer(0)

$time
[1] 7.951

$pval
[1] NA

$plot