[Math] linear least squares minimizing distance from points to rays – is it possible

linear algebranumerical methodsoptimization

I'm writing a tool whose purpose is to process data from a sensor that provides the true bearing to a target, and combine measurements taken at various times into an estimate of the target's position (at the time of the first measurement) and velocity. The platform bearing the sensor can move in any way while gathering data (and in fact must change velocity to provide a unique solution), but it is assumed that the target travels at a fixed velocity the entire time.

I'm currently using a linear least squares approach to find the solution that minimizes the distance between each bearing line and the point where the target would be at the time of the measurement. The problem is that the distance to a line at bearing 45 degrees is the same as the distance to a line at the reciprocal bearing 45+180 = 225 degrees. This can lead to the target being estimated to be in the opposite direction. I hoped that it wouldn't be a problem in practice, but in fact it is.

My current approach (in detail for those who are interested): for each bearing measurement, we know that $P=(P_x,P_y)$ is the position of the sensor platform, $\mathbf B$ is the normalized bearing line vector, and $t$ is the time since the first measurement. We want to solve for $A=(A_x,A_y)$ and $\mathbf V$, which are the position and velocity of the target at the time of the first measurement.

For each measurement, the target is at $A+\mathbf V t$, and we take $$ε = \operatorname{cross}(\mathbf V)\cdot (A+\mathbf V t -P)$$ (where $\operatorname{cross}\mathbf S$ produces a perpendicular vector to $\mathbf S$, returning $(-S_y,S_x)$. There's probably a name for that.) $A+\mathbf Vt-P$ represents the vector from P to the location of the target for that measurement. We'll call this vector $\mathbf T$. So the dot product gives us
$$ε = \cos\theta |\mathbf T | |\operatorname{cross}(\mathbf B)| = \cos\theta |\mathbf T|$$ ($\mathbf B$ is normalized), where θ is the angle between the target and the perpendicular of the bearing line. The cosine function is zero at 90 or 270 degrees, which is why we used the perpendicular vector, so that it would be zero at 0 or 180 degrees to the true bearing line (i.e. it would be zero when |T| coincides with the bearing line). Thus ε is the distance from the estimated target position to the bearing line, and ε is minimized when A and V are such that the estimated target position is on the bearing line.

To solve this, we take ε = cross(B) · (A+Vt-P) = -By(Ax+Vxt-Px) + Bx(Ay+Vyt-Py). Then we square it and take partial derivatives for Ax, Ay, Vx, and Vy, sum the observations, solve the normal equations… the standard least squares approach. (I can elaborate if somebody asks.)

As I said, it's considering the distance to the bearing lines, but I actually want it to consider the distance to the bearing rays. I can't think of a way to do such a thing with linear equations. (In fact, I'm unaware of how to formulate the distance from a point to a ray without using a piecewise function.) I suppose it may not be possible. However, I'm not the best at math, so I ask:

  1. Is it possible to do with a linear equation?
  2. If not, can anybody suggest an approach that would work and how to formulate the objective function for it?

Best Answer

ad 1) Because any linear function of T can be written in the form ε = N · T for some vector N, there is little hope to improve your method if we are restricted to linear least squares.

EDIT It's actually possible to formulate it as a linear least squares problem with linear restrictions, which allows to see that just computing the distance from the ray as you proposed should work fine and is easy to implement, see (*).

So better don't try to implement what I suggested under "ad 2)", because it is more complicated than necessary.

ad 2) Formulate it as a nonlinear least squares problem. Both the Gauss-Newton algorithm (GNA) and the Levenberg–Marquardt algorithm (LMA) are well suited to solve this type of problem. There are many free implementations of LMA available. It may be a good idea to generate a sufficient large number of random initial values to avoid problems with local minima or non-convergence.

I would compute the objective functions as follows: For each bearing measurement, apply a rotation matrix $R$ to the analytic expression for T such that the positive x-axis becomes the bearing ray. Then use ε = atan2(y-coord,x-coord) as measurement error whose sum of squares should be minimized.

A problem with this approach can occur when the target passes closely by the sensor, because the errors introduced by the assumption that the target travels at constant speed is not modeled well (and because atan2 has a singularity at (0,0) which was not there for your linear approach). This can be fixed by treating T for each bearing measurement as unknown, and minimizing it's deviation from the analytic expression for T. However, the random initial values should only be generated for A and V, and the initial values for T just computed with the analytic expression. (If this should really lead to convergence problems, then the initial values for T could also be randomly perturbed a bit.)

EDIT(*) When we treat T as unknown, we actually have to decide how we want to weight the errors of the measurements and the errors of our assumption against each other. In case we assume that the errors of the bearing measurements are negligible with respect to the assumption of constant velocity, we basically arrive again at the initial linear least squares formulation. It now reads $0$ = cross(B) · T and $0\leq$ B · T together with the fact that |T - (A+Vt-P)| should be minimized. This is identical to the initial formulation in case $0<$ B · T. Otherwise, we have to replace the distance from the line with the distance from the origin. (Which is equivalent to your initial idea to use the distance from the ray for the objective function.) The theory of quadratic programming has more to say about how the switching between the different cases has to be done correctly, but I hope it shouldn't be too difficult for this specific case.

EDIT 2 The squared distance from the bearing ray could be written in the form $[cross(B)\cdot T]^2 + [\max(0,-B\cdot T)]^2$. I would probably use this together with my favorite Levenberg-Marquardt implementation, because then I don't have to worry about solving the normal equations myself. If I would instead solve the "local" normal equations until the "local" normal equations no longer change, this would be equivalent to using the Gauss-Newton method. I'm just not sure whether this is guaranteed to work, that's why I pointed to the quadratic programming solution (which is guaranteed to work).

Related Question