Solved – predicting x,y position using machine learning

machine learning, regression

I am trying to predict the position (x,y) of a robot on a 2D plane, or eventually in 3D (a UAV).

I have multiple sensors which might be noisy (most probably they are), and I have 16 features, which are the sensor values.

I would like to infer the (x,y) position of the robot given these sensor values.

I also have a data set that contains the (x,y) positions of the robot together with the corresponding sensor values, so I am treating this as a supervised learning problem.

Can anyone suggest a machine learning technique to approach such a problem? I can already get "good" results using a sequential Monte Carlo method (particle filter), but I am curious to see what machine learning approach could be applied here.

Should it be tackled as a "time series" problem? If so, what kind of algorithms would perform well for this multi-output prediction (two output values, x and y)?
I also believe it cannot be a classification problem, because I would have far too many classes (based on the size of the space the robot is moving in) and would have to merge classes, reducing the resolution of the position estimate.

Also, it would be really helpful if anyone with expertise in this area could explain whether it should be treated as a regression or a classification problem, and why.

I guess this type of problem could be called "multi-target regression"; a rough sketch of such a baseline is below.
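For concreteness, this is roughly what a multi-target regression baseline on the 16 sensor features could look like with scikit-learn. The array names and the random placeholder data are assumptions to keep the sketch self-contained, not my actual data:

```python
# Hypothetical multi-target regression baseline: predict (x, y) from the 16
# sensor readings with a random forest, which handles multiple outputs natively.
# The data below is random placeholder data, not real sensor logs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
sensors = rng.normal(size=(1000, 16))    # 16 sensor features per time step
positions = rng.normal(size=(1000, 2))   # ground-truth (x, y) labels

X_train, X_test, y_train, y_test = train_test_split(
    sensors, positions, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)              # multi-output fit: y has shape (n, 2)

pred = model.predict(X_test)             # shape (n_test, 2) -> predicted (x, y)
print("MAE per axis:", mean_absolute_error(y_test, pred, multioutput="raw_values"))
```

Any regressor that supports multiple outputs (random forests, k-nearest neighbours, neural networks) could presumably be dropped in here in the same way.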

Best Answer

My advice, in short, would be to try a Kalman filter.

The longer version is this. To restate your problem, at every time step $t$ you have some noisy sensory estimates of robot position $(\hat{x}_t,\hat{y}_t)$, and you want to infer the robot's true position $(x_t,y_t)$.

Given only the data from a single time step, I don't think there's much you can do. Unless there is some consistent bias in the sensory estimates, your best guess of the robot's position given the current sensory data is simply $(\hat{x}_t,\hat{y}_t)$. However, your robot's position is presumably highly correlated from one time step to the next, so you could use this information to your advantage. To put it probabilistically, you can make use of the relationship $p(x_t,y_t|x_{t-1},y_{t-1})$ in the following manner: $$p(x_t,y_t|\hat{x}_{t:t-N},\hat{y}_{t:t-N}) \propto p(\hat{x}_t,\hat{y}_t|x_t,y_t)\int p(x_t,y_t|x_{t-1},y_{t-1})\,p(x_{t-1},y_{t-1}|\hat{x}_{t-1:t-N},\hat{y}_{t-1:t-N})\,dx_{t-1}\,dy_{t-1}$$

Let's break this down. In words, the idea is that your sensory information from all previous time points provides information about the robot's position at time $t-1$. This information is quantified by the distribution $p(x_{t-1},y_{t-1}|\hat{x}_{t-1:t-N},\hat{y}_{t-1:t-N})$. Now, the robot's position is correlated over time, and this link is described by $p(x_t,y_t|x_{t-1},y_{t-1})$ (i.e. given that the robot was at this location before, where is it likely to have moved to?). Thus, the rightmost part of the equation (the integral) means that you consider every possible location the robot could have been in before, given your history of sensory data, and this gives you a prediction of all the positions (and their probabilities) that the robot could be in now, based only on that historical data.

In other words, the history of sensory data constrains the range of positions where the robot could be right now. Finally, you update this belief by the information gained by observing the sensory data at the current time.

Note that this expression can be computed recursively, as the distribution of information gained from the sensory history up to $t-1$ can be decomposed into a term depending on $t-1$ and a term for the history up to $t-2$, so that you get a formula equivalent to the one above. Thus, in practice what you would do is start with the first two time points, compute the left hand side of the equation, and then continue with the next time point. The inference at each time point thus depends only on the sensory data at that time and the running estimate of the information based on the sensory history up to $t-1$. (In other words, the problem can effectively be cast as a Markov chain.)
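To make that recursion concrete, here is a minimal sketch of the predict/update cycle for the linear-Gaussian special case, which is exactly a Kalman filter. The specific transition, observation and noise matrices below are illustrative assumptions, not values from your problem:

```python
# Minimal sketch of the recursive predict/update cycle described above, for a
# linear-Gaussian model (i.e. a Kalman filter). F, Q, H, R are placeholders.
import numpy as np

F = np.eye(2)            # transition model: p(x_t | x_{t-1}) ~ N(F x_{t-1}, Q)
Q = 0.01 * np.eye(2)     # how far the robot can move between time steps
H = np.eye(2)            # observation model: sensors report position directly
R = 0.5 * np.eye(2)      # sensor noise covariance

def kalman_step(mean, cov, z):
    """One recursion: fold the new measurement z = (x_hat, y_hat) into the
    running belief p(x_t, y_t | history), returning the updated mean/covariance."""
    # Predict: push the previous belief through the transition model.
    mean_pred = F @ mean
    cov_pred = F @ cov @ F.T + Q
    # Update: weigh the prediction against the current sensory estimate.
    S = H @ cov_pred @ H.T + R                 # innovation covariance
    K = cov_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    mean_new = mean_pred + K @ (z - H @ mean_pred)
    cov_new = (np.eye(2) - K @ H) @ cov_pred
    return mean_new, cov_new

# Running estimate: start from an initial guess, then iterate over measurements.
mean, cov = np.array([0.0, 0.0]), np.eye(2)
for z in [np.array([0.1, 0.2]), np.array([0.15, 0.25]), np.array([0.3, 0.4])]:
    mean, cov = kalman_step(mean, cov, z)
print("estimated position:", mean)
```

The measurement $z$ here plays the role of $(\hat{x}_t,\hat{y}_t)$; in your case you would first map the 16 raw sensor values to such a position estimate, or extend $H$ accordingly.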

Where does machine learning come into this? Well, you need to know two things: (1) a transition function that gives you $p(x_t,y_t|x_{t-1},y_{t-1})$, i.e. the way the robot can change its location from one time point to the next, and (2) a generative model $p(\hat{x}_t,\hat{y}_t|x_t,y_t)$, i.e. a function that describes the probability of observing a certain sensory position reading given the robot's true position. Both functions may be known to you by construction (e.g. you may know the transition function if you know how the robot is programmed to behave, and you may know the generative model of your sensory readings from the manufacturer's specifications). If this information is not known a priori, however, you'd have to learn it from a set of training data. This is not necessarily something with a plug-and-play solution; you'd have to look at the data, consider what you know about the problem, and then figure out how best to model it.
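If those two ingredients do have to be learned, a very rough starting point for the linear-Gaussian case could be to fit them from your labelled trajectory by least squares. The variable names and the synthetic data in this sketch are assumptions for illustration only:

```python
# Rough sketch of estimating the two ingredients from a labelled trajectory,
# assuming linear-Gaussian forms. `true_xy` (T, 2) holds ground-truth positions
# and `meas_xy` (T, 2) the noisy position estimates; both are placeholders.
import numpy as np

rng = np.random.default_rng(1)
true_xy = np.cumsum(rng.normal(scale=0.1, size=(500, 2)), axis=0)  # placeholder trajectory
meas_xy = true_xy + rng.normal(scale=0.5, size=true_xy.shape)      # placeholder readings

# (1) Transition model p(x_t, y_t | x_{t-1}, y_{t-1}) ~ N(F x_{t-1}, Q):
#     fit F by least squares on consecutive true positions, Q from the residuals.
prev, curr = true_xy[:-1], true_xy[1:]
F_lstsq, *_ = np.linalg.lstsq(prev, curr, rcond=None)
F = F_lstsq.T                      # column-vector convention: x_t ~ F @ x_{t-1}
resid = curr - prev @ F.T
Q = np.cov(resid.T)

# (2) Generative model p(x_hat, y_hat | x, y) ~ N((x, y), R):
#     estimate the sensor noise covariance R from the measurement errors.
R = np.cov((meas_xy - true_xy).T)

print("F:\n", F, "\nQ:\n", Q, "\nR:\n", R)
```

The F, Q and R estimated here would plug directly into the earlier filter sketch; for non-linear dynamics or sensors you would instead fit, for example, a regression model for each of the two conditional distributions.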

"P.S.": I wrote all this and then found this question which might be more concise and to the point for your needs. But what the heck, I'll just leave this here as an explanation of the assumptions behind a Kalman filter.