[Math] Removing noise when the signal is not smooth

data-analysis, reference-request, signal-processing, soft-question

Suppose we have (an interval of) a time series of measurements:

[plot of raw (simulated) data]

We assume it can be explained as a "simple" underlying signal overlaid by noise. I'm interested in finding a good algorithm to estimate the value of the simple signal at a given point in time — primarily for the purpose of displaying it to human users who know, more or less, how to interpret the signal but would be distracted by the noise.

For a human observer of the plot it looks very clearly like the underlying signal has a jump discontinuity at about $t=18$. But that's a problem for automatic noise removal, because the techniques I know are all predicated on the underlying signal being "nice" in the sense of "smooth". A typical anti-noise filter would be something like convolving with a Gaussian kernel:

[plot of Gaussian smoothing]

which completely fails to convey that the left slope of the dip is any different from the right one. My current solution is to use a "naive" rolling average (i.e. convolution with a square kernel):

[plot of square smoothing]

whose benefit (aside from simplicity) is that at least the sharp bends in the signal estimate alert the viewer that something fishy is going on. But it takes quite some training for the viewer to know what fishy thing this pattern indicates. And it is still a tricky business to pinpoint when the abrupt change happened, which is sometimes important in my application.

The widths of the convolution kernels in the two examples above were chosen to give about the same smoothing of the pure noise (since I've cheated and actually constructed the sample data I'm showing as the sum of a crisp deliberate signal and some explicit noise). If we make them narrower, we can get the estimate to show that there's an abrupt change going on, but then they don't remove all of the noise:

[plot of estimation with a narrower kernel]
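For reference, here is a minimal sketch of the two baseline filters discussed above, assuming NumPy/SciPy. The simulated step signal is a stand-in for my actual data, and the kernel widths are illustrative, roughly matched so that both kernels attenuate pure white noise about equally:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d, uniform_filter1d

rng = np.random.default_rng(0)

# Stand-in for the measured series: a level with a jump at t = 18,
# plus white noise (the real data is of course not constructed).
t = np.linspace(0, 40, 800)
raw = np.where(t < 18, 1.0, 0.3) + rng.normal(0.0, 0.1, t.size)

# Gaussian kernel: blurs the jump into a smooth ramp.
gaussian_estimate = gaussian_filter1d(raw, sigma=14)

# Square kernel (rolling average): turns the jump into a linear ramp
# with sharp bends at its ends.  size ~ 2*sqrt(pi)*sigma gives about
# the same variance reduction on pure white noise as the Gaussian.
rolling_estimate = uniform_filter1d(raw, size=49)
```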

I can't be the first person ever to face this problem. Does it ring a bell for anyone? I'd appreciate ideas, pointers to literature, search terms, a conventional name for the problem, whatever.

Miscellaneous remarks:

  1. No, I cannot rigorously define what it is I want to optimize for. That would be too easy.

  2. It would be nice if the smoothed signal could show clearly that there is no significant systematic change in the signal before the jump at $t=18$.

  3. In the example I show here, the jump was much larger than the amplitude of the noise. That's not always the case in practice.

  4. I don't need real-time behavior, so it is fine that the estimate of the signal at some time depends on later samples. In fact I'd prefer to find a solution that commutes with time reversal.

  5. Basing a solution on outside knowledge about how the particular signal I'm looking at ought to behave is not an option. There are too many different measurements I want to apply it to, and often it's something we don't have a good prior expectation for.

Best Answer

I work in a group where we study sudden drops in current (due to ion-current blockades) in a particular measurement setup.

We detect these drops by computing a moving average (and in fact, as of recently, also a moving standard deviation, since the noise level varies over time in "bad" measurements) and selecting data points that fall more than $5\sigma$ below it (i.e., a significant drop).

Any contiguous set of such points is called an "event". To make sure we keep the right kind of events (as opposed to noise), we further select events by integrating the difference between the event and the moving average, and by looking at the event's duration (i.e., its number of data points) and its amplitude, but I suppose you won't need all of those.

What you could do is detect events the way we do, and then interpolate or smooth only between events (piecewise).
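Here is a minimal sketch of that recipe, assuming NumPy/pandas; the $5\sigma$ threshold is the one we actually use, while the window lengths, function names, and the simulated input are illustrative:

```python
import numpy as np
import pandas as pd

def detect_events(x, window=101, n_sigma=5.0):
    """Flag samples that fall more than n_sigma below a moving average,
    with sigma taken from a moving standard deviation (so the threshold
    tracks a noise level that varies over time)."""
    s = pd.Series(x)
    mean = s.rolling(window, center=True, min_periods=1).mean()
    std = s.rolling(window, center=True, min_periods=1).std()
    return (s < mean - n_sigma * std).to_numpy()

def piecewise_smooth(x, events, window=25):
    """Smooth only outside detected events, so abrupt changes are kept
    sharp instead of being blurred across the discontinuity."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    # Split the sample indices into contiguous runs of event/non-event.
    boundaries = np.flatnonzero(np.diff(events.astype(int))) + 1
    for seg in np.split(np.arange(x.size), boundaries):
        if not events[seg[0]]:  # smooth the quiet segments only
            out[seg] = (pd.Series(x[seg])
                        .rolling(window, center=True, min_periods=1)
                        .mean().to_numpy())
    return out

# Illustrative usage on a simulated step-plus-noise series.
rng = np.random.default_rng(0)
t = np.linspace(0, 40, 800)
raw = np.where(t < 18, 1.0, 0.3) + rng.normal(0.0, 0.05, t.size)
events = detect_events(raw)
display = piecewise_smooth(raw, events)
```

Our additional selection criteria (integrated area, duration, amplitude) would then be applied to each contiguous run of flagged points before accepting it as an event.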

I must admit that we probably miss (fail to detect) some "actual" events because of noise issues. Judging from your example, though, this should not be a problem for you; our code would detect that jump without hesitation.


One more word on our research, because it's so much fun:

In fact, in our case the drops in current are so small, and the noise so large, that we first apply an 8-pole Bessel filter (typically at 10 kHz) before we can even see the drops. But once the events have been detected, we can go back to the unfiltered signal and continue the analysis there.
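For illustration, a digital analogue of such a filter in SciPy; the 8-pole, 10 kHz low-pass configuration is the one mentioned above, while the 250 kHz sampling rate (and doing it in software at all) is an assumption for the sketch:

```python
import numpy as np
from scipy import signal

fs = 250e3  # sampling rate in Hz (assumed; not from our actual setup)

# Stand-in for the unfiltered current trace.
rng = np.random.default_rng(0)
raw_current = rng.normal(0.0, 1.0, 100_000)

# 8th-order low-pass Bessel filter with a 10 kHz cutoff, applied
# causally (like the analog filter in an acquisition chain).
sos = signal.bessel(8, 10e3, btype='low', fs=fs, output='sos')
filtered = signal.sosfilt(sos, raw_current)

# Event detection would run on `filtered`; once events are located,
# the analysis continues on `raw_current`.
```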

For an introduction to our research, see this paper.
