How to solve an overdetermined linear system given equations with different uncertainties

least squares, linear algebra, systems of equations

Please, I would like some help to solve the following problem:

I have an overdetermined system of linear equations and want to minimize the overall error. So far this is not a problem; I could use least squares. The problem is that I know that some equations in my system are more uncertain, while others are exact. Actually, I have a number of equations with different confidence levels ("low confidence", "medium confidence", "high confidence" and so on). In an AX=B system, the solution should take this into account and keep the B coefficients of the "high confidence" equations unchanged, while the B coefficients of the "low confidence" equations could be changed more drastically than those of the "medium confidence" equations.

I am thinking about using some kind of gradient descent with a weighted error calculation, but first I would like to know if there is a better, more formal, or more efficient way to solve this.

Thanks in advance

Bernardo Aflalo

Best Answer

Following up on the comment by user141267: an efficient way to give more or less weight to equations in an overdetermined system is to rescale them; that is, multiply both $A$ and $b$ on the left by a diagonal matrix $W$. Here is an example: $$\begin{cases} x+ y &= 10 \\ x+2y &= 14 \\ x+3y &= 30\end{cases}$$ To solve this system using least squares, I used lsq(A,b) in Scilab with $$A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}, \quad b = \begin{pmatrix} 10 \\ 14 \\ 30 \end{pmatrix}$$ and got $x=-2$, $y=10$. The right-hand side that this solution reproduces exactly is $(8, 18, 28)^T$.
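For readers without Scilab, the unweighted solve can be reproduced with a minimal sketch in Python. It assumes NumPy is available and uses `numpy.linalg.lstsq` as a stand-in for Scilab's lsq; the variable names are illustrative and not part of the original answer.

```python
import numpy as np

# Coefficient matrix and right-hand side from the example.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([10.0, 14.0, 30.0])

# Ordinary least squares: minimizes the sum of squared residuals of A @ z - b.
solution, residual_ss, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# Right-hand side that the least-squares solution reproduces exactly.
fitted = A @ solution

print(solution)  # approximately [-2, 10]
print(fitted)    # approximately [8, 18, 28]
```

All three equations are treated equally here, so the misfit is spread across them: the residuals are $2$, $-4$ and $2$.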

But suppose the first equation is very important / certain, while the last one is the least important. If I let $W$ be the diagonal matrix with entries $(4,1,1/2)$ and run lsq(W*A,W*b), the result is $x \approx 2.84$, $y \approx 7.07$. The right-hand side that this solution reproduces exactly is approximately $(9.91, 16.98, 24.05)^T$. So the first equation is satisfied almost exactly, while the last equation is far from its target of $30$.
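Rescaling row $i$ by a weight $w_i$ amounts to minimizing $\sum_i w_i^2 (a_i^T z - b_i)^2$, where $a_i^T$ is the $i$-th row of $A$ and $z = (x, y)^T$ (symbols introduced here for illustration). The following is a minimal sketch of that idea in Python, assuming NumPy; `weighted_lstsq` is a hypothetical helper name, not a library function.

```python
import numpy as np

def weighted_lstsq(A, b, weights):
    """Solve min_z sum_i (weights[i] * (A[i] @ z - b[i]))**2 by row rescaling.

    Hypothetical helper: scales each row of A and each entry of b by its
    weight, then calls ordinary least squares on the rescaled system.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Same as forming W @ A and W @ b with W = diag(w), without building W.
    z, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
    return z

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([10.0, 14.0, 30.0])

# Weights from the example: first equation trusted most, last one least.
z_weighted = weighted_lstsq(A, b, [4.0, 1.0, 0.5])
fitted_weighted = A @ z_weighted

print(z_weighted)       # approximately [2.84, 7.07]
print(fitted_weighted)  # approximately [9.91, 16.98, 24.05]
```

If the uncertainty of each equation is known as a standard deviation $\sigma_i$, a common choice is $w_i = 1/\sigma_i$. A finite weight only makes an equation nearly exact; increasing it tightens the fit to that equation, but very large weights can make the rescaled system poorly conditioned.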
