Linear system of equations with an equality constraint on some of the unknowns

constraintslinear algebrasystems of equations

Background

I have a system of equations which can be written in matrix form as
\begin{align}
\mathbf{A}\mathbf{x} = \mathbf{0},\label{eq:system}\tag{1}
\end{align}

where $\mathbf{A}$ is invertible and has size $N \times N$, and the values of some of the entries of $\mathbf{x}$ (let's call them $x_i$) have to satisfy
\begin{align}
x_i = x_j – c_iy_i – d,\label{eq:constraint}\tag{2}
\end{align}

where $x_j$ is some other entry of $\mathbf{x}$. Also, $y_i\in\mathbf{y}$ which is another vector of length $M < N$, and is known. (in reality it is unknown, and I have to add in another set of equations, but I'm simplifying the situation here).
Equation (\ref{eq:constraint}) can be written in matrix form as
\begin{align}
\mathbf{P}_i\mathbf{x} = \mathbf{P}_j\mathbf{x} – \mathbf{C}\mathbf{y} – \mathbf{d},\label{eq:matconstraint}\tag{3}
\end{align}

where $\mathbf{P}_i$ "selects" the $x_i$ entries of $\mathbf{x}$, $\mathbf{P}_j$ "selects" the $x_j$ entries of $\mathbf{x}$, and $\mathbf{C}$ "selects" the $y_i$ entries of $\mathbf{y}$. Assuming there are $M < N$ such constraints, then the size of $\mathbf{P}_i$ and $\mathbf{P}_j$ is $M \times N$, the size of $\mathbf{C}$ is $M \times M$, and $\mathbf{d}$ has length $M$, as does $\mathbf{y}$.

The goal is to solve (\ref{eq:system}) subject to (\ref{eq:matconstraint}).
In particular, I want to end up with one linear system of equations, solvable with standard linear techniques, e.g. LU factorization, GMRES.

My Approach

To satisfy the constraints, I decided to delete the columns of (\ref{eq:system}) which correspond to $x_i$, and correspondingly delete those $x_i$ from $\mathbf{x}$. I plan to do this using an "incidence" matrix $\mathbf{Q}$ with ones and zeros, to get
\begin{align}
\mathbf{A}\mathbf{Q}^T\mathbf{Q}\mathbf{x},
\end{align}

where $^T$ means transpose.
Now I need to to add in the contributions of my "known" $x_i$, which I can do as
\begin{align}
\mathbf{A}\mathbf{Q}^T\mathbf{Q}\mathbf{x} + \mathbf{A}\mathbf{P}_i\mathbf{x} = \mathbf{0},
\end{align}

where $\mathbf{P}_i$ "selects" the $x_i$ entries of $\mathbf{x}$.
Finally, subbing in (\ref{eq:matconstraint}), I can write
\begin{align}
\mathbf{A}\mathbf{Q}^T\mathbf{Q}\mathbf{x} + \mathbf{A}\mathbf{P}_i\left(\mathbf{P}_j\mathbf{x} – \mathbf{C}\mathbf{y} – \mathbf{d}\right) = \mathbf{0}.\label{eq:final}\tag{4}
\end{align}

Questions

  • Is (\ref{eq:final}) the correct requisite linear system that I must solve? I may have made some silly indexing mistakes in the above, but are the concepts correct?
  • Will the resulting system (\ref{eq:final}) be invertible, given $\mathbf{A}$ is invertible? Alternatively, are there any conditions which guarantee that (\ref{eq:final}) is uniquely solvable?
  • Is my approach a "standard" approach for how one would solve such a problem? I have found that some similar problems require least squares solvers or some kind of optimization (e.g. this and this), but am I correct in assuming that those aren't needed here, and a linear solver is enough? I don't fully understand when and why one would pick an optimization routine over the above approach. In particular, this question seems similar to mine, but apparently there the answer is to simply invert the original system, and the boundary conditions are ignored, according to the only answer. The right-hand side in that question is not zero though – is that the key difference?

Thank you!

Best Answer

  1. Add a row to $x$, $x_{n+1}$; add a row to $A$ that looks like $\pmatrix{0&0&\dots&0}$. Add a "1" at the bottom of the $0$ vector on the right-hand side. You now have the same system, except you have $x_{n+1} = 1$.

  2. Rewrite this equation: \begin{align} x_i = x_j - c_iy_i - d, \end{align} as \begin{align} 1 \cdot x_i - 1 \cdot x_k + d \cdot x_{n+1} = c_iy_i \end{align} and define $z_i =c_i y_i$ to get rid of an extraneous variable.

Now let $h$ be a vector with $+1$ in the $i$th slot, a $-1$ in the $j$th slot, and $d$ in the $n+1$th slot, and zeroes everywhere else. You can append the vector $h$ to the bottom of $A$, and append $z_i$ to the bottom of the right-hand side. Do this once for each relation of the form you've written above.

Now $A$ has more rows than columns, so you can't just invert it. But that's OK, because in general there won't be any solutions to your enlarged system.

Indeed, there's generally only one solution to the initial system. Because $$ Ax = 0 $$ and $A$ is invertible, we have $$ A^{-1}Ax = A^{-1} 0 \\ Ix = A^{-1}0 \\ x = 0 $$ So...if you start constraining the $x_i$ entries to be nonzero, you're going to get no solutions.

Related Question