Wavelets – Application of Wavelets to Time-Series-Based Anomaly Detection Algorithms

outliers, signal processing, time series, wavelet

I've begun working my way through the Statistical Data Mining Tutorials by Andrew Moore (highly recommended for anyone else first venturing into this field). I started by reading this extremely interesting PDF entitled "Introductory overview of time-series-based anomaly detection algorithms", in which Moore traces through many of the techniques used in creating an algorithm to detect disease outbreaks. Halfway through the slides, on page 27, he lists a number of other "state of the art methods" used to detect outbreaks. The first one listed is wavelets. Wikipedia describes a wavelet as

a wave-like oscillation with an
amplitude that starts out at zero,
increases, and then decreases back to
zero. It can typically be visualized
as a "brief oscillation"

but does not describe their application to statistics, and my Google searches yield either highly academic papers that assume a knowledge of how wavelets relate to statistics or full books on the subject.

I would like a basic understanding of how wavelets are applied to time-series anomaly detection, much in the way Moore illustrates the other techniques in his tutorial. Can someone provide an explanation of how detection methods using wavelets work or a link to an understandable article on the matter?

Best Answer

Wavelets are useful for detecting singularities in a signal (see, for example, the paper here, in particular Figure 3 for an illustration, and the references mentioned in that paper). I would guess that a singularity can sometimes be an anomaly.

The idea here is that the continuous wavelet transform (CWT) has maxima lines that propagate across frequencies (scales): the longer the line, the stronger the singularity. See Figure 3 in the paper to see what I mean. Note that there is free Matlab code related to that paper; it should be here.
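To make the maxima-line idea concrete, here is a minimal pure-Python sketch. It is not the method or the Matlab code from the paper. It assumes a Mexican hat wavelet, a naive direct-correlation CWT, a synthetic signal (one sine cycle plus a spike), and a crude stand-in for maxima lines: the location of the largest |CWT| value at each scale. The names `mexican_hat`, `cwt`, `spike_at` and `peaks` are illustrative only.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) wavelet, up to a normalising constant."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt(signal, scales):
    """Naive CWT: correlate the signal with scaled copies of the wavelet."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(5 * s)  # the wavelet is negligible beyond 5 scale units
        kernel = [mexican_hat(k / s) / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                # Clamp indices so the signal is padded with its edge values.
                idx = min(max(i + j - half, 0), n - 1)
                acc += signal[idx] * w
            row.append(acc)
        rows.append(row)
    return rows

# Synthetic example (an assumption): a smooth sine with one spike added.
n = 256
spike_at = 128
signal = [math.sin(2 * math.pi * t / n) for t in range(n)]
signal[spike_at] += 5.0

scales = [1, 2, 4, 8]
coeffs = cwt(signal, scales)

# Simplified "maxima line": where is |CWT| largest at each scale?
peaks = [max(range(n), key=lambda i: abs(row[i])) for row in coeffs]
print(peaks)
```

If the per-scale peak locations line up at the same time index, they trace a line across scales that points at the singularity. A proper wavelet-transform modulus-maxima analysis chains all local maxima across many scales rather than taking one global maximum per scale, so treat this only as an illustration of the principle.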


Additionally, I can give some heuristics for why the discrete wavelet transform (DWT) is interesting for a statistician (the preceding example concerns the continuous transform, and this list is not exhaustive):

  • There is a wide class of realistic signals (those in Besov spaces) that the wavelet transform maps to a sparse sequence (compression property).
  • There is a wide class of (quasi-stationary) processes that the wavelet transform maps to a sequence with almost uncorrelated features (decorrelation property).
  • Wavelet coefficients contain information that is localized in both time and frequency, at different scales (multi-scale property).
  • The wavelet coefficients of a signal concentrate on its singularities.
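As a rough illustration of the last point, the sketch below applies one level of the Haar DWT to a synthetic series and flags the detail coefficients that are far from typical. Everything here is an assumption made for illustration: the Haar wavelet, the single decomposition level, the median/MAD threshold with `k=5.0`, and the simulated data. It is not the method from Moore's slides or from the paper mentioned above.

```python
import math
import random
import statistics

def haar_step(x):
    """One level of the orthonormal Haar DWT (len(x) must be even)."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return approx, detail

def flag_anomalies(x, k=5.0):
    """Return the start index of each sample pair whose finest-scale
    detail coefficient lies more than k robust standard deviations
    from the median detail coefficient."""
    _, detail = haar_step(x)
    med = statistics.median(detail)
    mad = statistics.median([abs(d - med) for d in detail])
    sigma = 1.4826 * mad  # MAD rescaled to estimate a Gaussian std. dev.
    return [2 * i for i, d in enumerate(detail) if abs(d - med) > k * sigma]

# Synthetic series (an assumption): slow seasonal trend plus Gaussian
# noise, with a single spike injected at index 90.
rng = random.Random(0)
n = 128
signal = [2.0 * math.sin(2 * math.pi * t / n) + rng.gauss(0.0, 0.5)
          for t in range(n)]
signal[90] += 10.0

flagged = flag_anomalies(signal)
print(flagged)
```

The smooth trend nearly cancels inside each pair of neighbouring samples, so the detail coefficients are mostly small noise, while the spike produces one large coefficient. A single Haar level with non-overlapping pairs misses a level shift that falls exactly between two pairs, so a practical detector would more likely look at several levels or use an undecimated transform.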