Solved – How to apply Kalman filter to one dimensional data

kalman filter

I asked a question on StackOverflow and was advised to use a Kalman filter. The question is as follows:

https://stackoverflow.com/questions/5726358/what-class-of-algorithms-reduce-margin-of-error-in-continuous-stream-of-input/5728373#5728373

A machine is taking measurements and
giving me discrete numbers
continuously like so:

1 2 5 7 8 10 11 12 13 14 18

Let us say these measurements can be
off by 2 points and a measurement is
generated every 5 seconds. I want to
ignore the measurements that may
potentially be the same.

For example, consecutive readings of 2 and 3
could be the same value, because the margin
of error is 2. How do I partition the data
so that I get only distinct measurements?
I would also want to handle the situation in
which the measurements are
continuously increasing, like so:

1 2 3 4 5 6 7 8 9 10

In this case if we keep ignoring the
consecutive numbers with difference of
less than 2 then we might lose actual
measurements.

Now how do I apply a Kalman filter to solve this? All the examples I see take multiple error estimates, while I know only one thing: each measurement can be off by a value Q. The examples also all work on multidimensional vectors, and on several vectors at that.

Best Answer

No Model

Wayne points out that you do not say anything about the process generating the data. In particular, you don't say whether the quantity you are tracking, the 'state', is fixed or moving. If it's fixed - the case Wayne considers - then you may as well keep a running average of all the observations and hope for the best, because that's all a full-blown state space model with the state estimated by a KF would do for you anyway. If it's moving, then you need to ask yourself how it moves. Is it a random walk? Is it constantly increasing? Does it have cyclical or other recurring structure? You need those assumptions to define the state space model for which the KF supplies state estimates.
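To make the fixed-state case concrete, here is a minimal sketch of a scalar Kalman filter with no process noise, run next to a plain running average on the numbers from the question. The function names, the very large prior variance (standing in for "no prior knowledge") and the measurement variance of 1.0 are my assumptions for illustration, not anything given in the question. With a near-flat prior the measurement variance drops out of the estimate entirely.

```python
def kalman_constant_state(observations, meas_var, prior_mean=0.0, prior_var=1e12):
    """Scalar Kalman filter for a state assumed never to change (no process noise)."""
    mean, var = prior_mean, prior_var
    estimates = []
    for z in observations:
        gain = var / (var + meas_var)    # Kalman gain
        mean = mean + gain * (z - mean)  # pull the estimate toward the new observation
        var = gain * meas_var            # posterior variance; equals (1 - gain) * var
        estimates.append(mean)
    return estimates


def running_mean(observations):
    """Plain running average of everything seen so far."""
    total = 0.0
    means = []
    for n, z in enumerate(observations, start=1):
        total += z
        means.append(total / n)
    return means


data = [1, 2, 5, 7, 8, 10, 11, 12, 13, 14, 18]  # the readings from the question
kf_estimates = kalman_constant_state(data, meas_var=1.0)  # meas_var is an arbitrary assumption
avg_estimates = running_mean(data)

for kf, avg in zip(kf_estimates, avg_estimates):
    print(f"{kf:8.4f} {avg:8.4f}")
```

The two columns agree to within rounding, which is the sense in which the KF buys you nothing over a running average when the state is fixed.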

Off by 2

When you say 'let us say that measurements can be off by 2 points' you may think you are making things easier, either to explain or implement, but you aren't. If we take the 'off by 2' idea literally, then Kalman filtering cannot do what you want (although I suppose it might possibly approximate it). This is because the KF assumes your observations are conditionally Normally distributed. Your measurement error assumption would instead be Uniform. This will lead to incorrect inferences about the state if you apply the KF directly.
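A small simulation illustrates the mismatch. It assumes that 'off by 2' means an error drawn uniformly from [-2, 2] (my reading, not something the question states) and compares it with a Normal error of the same variance, which is the closest thing a KF could work with. The variable names and sample size are illustrative.

```python
import random
import statistics

half_width = 2.0  # assumed reading of "off by 2": error is Uniform(-2, 2)

# Variance of Uniform(a, b) is (b - a)^2 / 12, so here 16 / 12 = 4 / 3.
matched_var = (2 * half_width) ** 2 / 12.0

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples = 50_000
uniform_errors = [rng.uniform(-half_width, half_width) for _ in range(n_samples)]
normal_errors = [rng.gauss(0.0, matched_var ** 0.5) for _ in range(n_samples)]


def fraction_outside(errors, bound):
    """Share of simulated errors whose magnitude exceeds the bound."""
    return sum(abs(e) > bound for e in errors) / len(errors)


print("sample variance (uniform):", statistics.pvariance(uniform_errors))
print("sample variance (normal): ", statistics.pvariance(normal_errors))
print("uniform errors beyond 2:  ", fraction_outside(uniform_errors, half_width))
print("normal errors beyond 2:   ", fraction_outside(normal_errors, half_width))
```

The uniform errors never exceed the bound, while the variance-matched Normal puts a noticeable share of its mass (roughly 8% in theory) beyond it. A KF built on the Normal assumption therefore treats some impossible errors as plausible.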

Thinking About the Problem Statistically

You ask 'how do I partition the data such that I get only distinct measurements'. That's a good question if all your data are either good measurements or bad ones. However, when considering a KF we assume instead that all measurements have some error, about which we have a small theory - the state space model - that contains a sub-model of this error and another sub-model of the evolution of an underlying state that generates the measurements. The KF makes inferences on the basis of this theory. Consequently, in this framework you don't privilege any measurements but rather look to the estimated state for your answers.
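As an illustration of what 'look to the estimated state' means in one dimension, here is a sketch of about the simplest state space model: the state is a random walk and each observation is the state plus noise. All the numbers are assumptions for the sake of the example. The measurement variance of 4/3 is the variance of a Uniform(-2, 2) error, used as a Normal approximation. The process variance of 1.0 and the starting values are guesses you would have to replace with your own knowledge of the machine.

```python
def kalman_local_level(observations, meas_var, process_var, init_mean, init_var):
    """Scalar Kalman filter for: state_t = state_{t-1} + w_t, obs_t = state_t + v_t."""
    mean, var = init_mean, init_var
    filtered = []
    for z in observations:
        # Predict: a random walk keeps the mean but becomes less certain.
        var = var + process_var
        # Update: blend the prediction and the observation by their precisions.
        gain = var / (var + meas_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        filtered.append((mean, var))
    return filtered


data = [1, 2, 5, 7, 8, 10, 11, 12, 13, 14, 18]  # the readings from the question
MEAS_VAR = 4.0 / 3.0   # assumption: variance of a Uniform(-2, 2) error
PROCESS_VAR = 1.0      # assumption: how far the true value drifts per step

filtered = kalman_local_level(
    data,
    meas_var=MEAS_VAR,
    process_var=PROCESS_VAR,
    init_mean=float(data[0]),  # assumption: start at the first reading
    init_var=100.0,            # assumption: but be very unsure about it
)

for z, (mean, var) in zip(data, filtered):
    print(f"obs={z:3d}  state estimate={mean:7.3f}  +/- {var ** 0.5:.3f}")
```

No reading is kept or thrown away here. Each one nudges the state estimate by an amount set by the gain, and the reported variance says how much to trust the result.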

Suggestion

If you don't feel like specifying, or simply cannot specify, as much detail as a complete state space model needs before you can apply a KF, it might be better to back off to more 'empirical' (and easier to implement) methods based on smoothing and weighted averages of recent data. Exponential smoothing might be a helpful place to start; it is described fairly clearly at http://www.duke.edu/~rnau/411avg.htm and has quite close connections to KF approaches, so you could return to them easily if necessary.
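For reference, simple exponential smoothing takes only a few lines. This is a minimal sketch. The smoothing weight alpha=0.5 and the choice to start from the first reading are arbitrary assumptions you would tune for your data.

```python
def exponential_smoothing(observations, alpha):
    """Simple exponential smoothing: s_t = alpha * z_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [float(observations[0])]  # assumption: initialise at the first reading
    for z in observations[1:]:
        smoothed.append(alpha * z + (1.0 - alpha) * smoothed[-1])
    return smoothed


data = [1, 2, 5, 7, 8, 10, 11, 12, 13, 14, 18]  # the readings from the question
smoothed = exponential_smoothing(data, alpha=0.5)  # alpha is an illustrative guess

for z, s in zip(data, smoothed):
    print(f"obs={z:3d}  smoothed={s:7.3f}")
```

The update can be rewritten as s + alpha * (z - s), which has the same form as a scalar Kalman update with the gain held constant. A larger alpha follows the data more closely, and a smaller one smooths harder.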
