I have the following matrix
$$P(s) := \begin{bmatrix}
s^2 & s-1 \\
s & s^2
\end{bmatrix}$$
How does one compute the Smith normal form of this matrix? I can't quite grasp the algorithm.
control theorymatricespolynomialssmith-normal-form
I have the following matrix
$$P(s) := \begin{bmatrix}
s^2 & s-1 \\
s & s^2
\end{bmatrix}$$
How does one compute the Smith normal form of this matrix? I can't quite grasp the algorithm.
The computation requires two processes:
Specifically, you are allowed to
The first goal is to reach diagonal form. Let's first work on column 1: using operation 3 for rows repeatedly, you can, by following the Euclidean algorithm, form a row whose first element is the GCD of the elements in column 1. You can then obtain a matrix with the GCD in the $(1,1)$ position and zeroes in the rest of column 1.
Now work on row 1: do the same thing, but using column operations; eventually you will have the GCD of row 1 in the $(1,1)$ position, and zeroes elsewhere in row 1. You will most likely have messed up column 1, but that's OK. Go back and redo column 1, then redo row 1, and repeat until all elements in row and column 1 are 0 except for the $(1,1)$ element. This process is guaranteed to terminate because the GCD gets smaller each time.
Now you can move on to row/column 2, and repeat the process. Continue for row/column 3, and so on, until you have reached diagonal form.
You may not be done at this point, because the diagonal elements may not satisfy the divisibility requirement of the Smith normal form. You can, however, enforce this by some additional moves as in the following example: $$ \begin{bmatrix}8 & 0\\0 & 12\end{bmatrix}\longrightarrow\begin{bmatrix}8 & 8\\0 & 12\end{bmatrix}\longrightarrow\begin{bmatrix}8 & 8\\-8 & 4\end{bmatrix}\longrightarrow\begin{bmatrix}24 & 8\\0 & 4\end{bmatrix}\longrightarrow\begin{bmatrix}24 & 0\\0 & 4\end{bmatrix}\longrightarrow\begin{bmatrix}4 & 0\\0 & 24\end{bmatrix}. $$ The idea is again to use the Eulidean algorithm. After adding column 1 to column 2, you may have to do several row operations to obtain the GCD of the two original diagonal elements. In the example above, only one row operation was needed for this.
Addendum: Details for the particular example in the OP's question. If you add row 2 to row 1, and then multiply row 1 by $-1$, you get a pivot of 1. You can then subtract suitable multiples of row 1 from rows 2 and 4, and end up with $$ \begin{bmatrix} 1 & 561 & -174 & -80 \\ 0 & -3477 & 1080 & 474 \\ 0 & -255 & 81 & 24 \\ 0 & -4182 & 1299 & 570 \end{bmatrix}, $$ which, by subtracting multiples of column 1 from columns 2, 3, and 4, leads to $$ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -3477 & 1080 & 474 \\ 0 & -255 & 81 & 24 \\ 0 & -4182 & 1299 & 570 \end{bmatrix}. $$ We next need to do a series of row operations involving rows 2, 3, and 4 that results in the GCD of 3477, 255, and 4182. If we always choose the pivot of smallest nonzero magnitude, the steps would be
You should now have $$ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 234 & -1410 \\ 0 & 0 & -651 & 3906 \\ 0 & 0 & -147 & 882 \end{bmatrix}. $$ You can now subtract suitable multiples of column 2 from columns 3 and 4. Then you just have to deal with the $2\times2$ block in the lower right. You should be able to get it from here.
I'll answer your second question first.
Let
$$P(x) = \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&(x-2)^2&0\\0&0&0&(x-1)(x-2)\end{bmatrix}.$$
We need to fix the bottom right $2 \times 2$ principal submatrix. I have explained how to do that here, so I'll just present the results here:
\begin{align*} \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&(x-2)^2&0\\0&0&0&(x-1)(x-2)\end{bmatrix} &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&(x-2)^2&(x-1)(x-2)\\0&0&0&(x-1)(x-2)\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&(x-2)^2&(x-2)^2+(x-2)\\0&0&0&(x-1)(x-2)\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&(x-2)^2&x-2\\0&0&0&(x-1)(x-2)\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&x-2&(x-2)^2\\0&0&(x-1)(x-2)&0\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&x-2&0\\0&0&(x-1)(x-2)&-(x-1)(x-2)^2\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&x-2&0\\0&0&0&-(x-1)(x-2)^2\end{bmatrix} \\ &\mapsto \begin{bmatrix}1&0&0&0\\0&1&0&0\\0&0&x-2&0\\0&0&0&(x-1)(x-2)^2\end{bmatrix} \end{align*}
Irreducible factors are $x-2$ and $x-1$, while the elementary divisors are $x-2$, $(x-2)^2$ and $x-1$.
I think, but I'm not sure, that what you suggest in your first question can always be done to obtain the correct result. One would have to prove it to actually use it. Personally, I find it easier to just properly reduce the matrix polynomial or to fix it to become the Smith normal form, like I did above.
Best Answer
Following the algorithm from Wilkinson's "The algebraic eigenvalue problem".
I will denote the current polynomial matrix as $P(\lambda)$ and its elements as $p_{ij}(\lambda)$, meaning that their values will change, with the final goal of having $P(\lambda)$ in the Smith form.
We start with
$$P(\lambda) = \begin{bmatrix} \lambda^2 & \lambda - 1 \\ \lambda & \lambda^2 \end{bmatrix}.$$
Pivot $(1,2)$.
Pick one of the polynomials with the lowest degree.
It seems that it would be easier to pick $p_{21}$ due to the step 3, but I'll go for $p_{12}$, as it seems to show better how that part is done. I'll later do the whole algorithm using $p_{21}$ as a pivot, and we'll see that it gets easier step 3, but is more complicated later on.
$$P(\lambda) = \begin{bmatrix} \lambda^2 & \color{red}{\lambda - 1} \\ \lambda & \lambda^2 \end{bmatrix}.$$
Swap rows/columns so that the pivot comes to the position $(1,1)$. In our case, we need to swap the first and the second column. We now have
$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda - 1} & \lambda^2 \\ \lambda^2 & \lambda \end{bmatrix}.$$
Present all the polynomials in the first row and the first column (i.e., either $i=1$ or $j=1$) as $p_{ij}(\lambda) = p_{11}(\lambda) q_{ij}(\lambda) + r_{ij}(\lambda)$. In our case:
$$P(\lambda) = \begin{bmatrix} \lambda - 1 & \color{red}{\lambda^2} \\ \color{red}{\lambda^2} & \lambda \end{bmatrix} = \begin{bmatrix} \lambda - 1 & \color{red}{(\lambda-1)(\lambda+1)+1} \\ \color{red}{(\lambda-1)(\lambda+1)+1} & \lambda \end{bmatrix}.$$
So, in both cases, we got $q_{ij}(\lambda) = (\lambda+1)$ and $r_{ij}(\lambda) = 1$.
Since $r_{ij} \not\equiv 0$, we need to fix that. Let us first deal with $r_{21}$. We substract $q_{21}$ times the first row from second row (the one in which is our $r_{21}$) and then interchange those two rows.
First, we compute $q_{12}$ times the first row:
$$q_{12}(\lambda) \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} = \begin{bmatrix} (\lambda+1)(\lambda - 1) & (\lambda+1)\lambda^2 \end{bmatrix} = \begin{bmatrix} \lambda^2 - 1 & \lambda^3 + \lambda^2 \end{bmatrix}.$$
Substracting that from the second row, we get
$$\begin{bmatrix} \lambda^2 & \lambda \end{bmatrix} - \begin{bmatrix} \lambda^2 - 1 & \lambda^3 + \lambda^2 \end{bmatrix} = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \end{bmatrix}.$$
Now, exchanging it with the first row, we obtain
$$P(\lambda) = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \\ \lambda - 1 & (\lambda-1)(\lambda+1)+1 \end{bmatrix}.$$
This is to be repeated until all elements of the first row and the first column are divisible by $p_{11}$. In our case, $p_{11} \equiv 1$, so we got that, and we can write them in the form $p_{ij} = p_{11}q_{ij}$:
$$P(\lambda) = \begin{bmatrix} 1 & \color{red}{1 \cdot (-\lambda^3 - \lambda^2 + \lambda)} \\ \color{red}{1 \cdot (\lambda - 1)} & (\lambda-1)(\lambda+1)+1 \end{bmatrix}.$$
We now subtract the appropriate multiples of the first row from the rows $i=2,\dots,n$ and the appropriate multiples of the first column from the columns $ji=2,\dots,n$.
Let us first find the multiple of the first row, to be subtracted from the second one, denoting it as ${\rm row}_2$. Since $p_{11} \equiv 1$, we see that ${\rm row}_2$ equals the first row multiplied by $p_{21}(\lambda)$:
\begin{align*} {\rm row}_2 &= p_{21}(\lambda) \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \end{bmatrix} = \begin{bmatrix} (\lambda - 1) \cdot 1 & (\lambda - 1) \cdot (-\lambda^3 - \lambda^2 + \lambda) \end{bmatrix} \\ &= \begin{bmatrix} \lambda - 1 & -\lambda^4 + 2 \lambda^2 - \lambda \end{bmatrix}.\end{align*}
We now subtract these from the current second row and the current second column:
\begin{align*} \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} - {\rm row}_2 &= \begin{bmatrix} \lambda - 1 & \lambda^2 \end{bmatrix} - \begin{bmatrix} \lambda - 1 & -\lambda^4 + 2 \lambda^2 - \lambda \end{bmatrix} \\ &= \begin{bmatrix} 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}. \\ \end{align*}
Now, we need to be careful here, because we have just changed the second column and must not be working with the old one above. Observe our current $P(\lambda)$:
$$P(\lambda) = \begin{bmatrix} 1 & -\lambda^3 - \lambda^2 + \lambda \\ \color{red}{0} & \color{red}{\lambda^4 - \lambda^2 + \lambda} \end{bmatrix}.$$
Obviously, subtracting the first column times $(-\lambda^3 - \lambda^2 + \lambda)$ gives us
$$P(\lambda) = \begin{bmatrix} 1 & \color{red}{0} \\ 0 & \color{red}{\lambda^4 - \lambda^2 + \lambda} \end{bmatrix}.$$
Repeat the process on the bottom right principal submatrix of order $n-1$. In our case, $n = 2$, so the job is done, and the Smith form is
$$P(\lambda) = \begin{bmatrix} 1 & 0 \\ 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}.$$
Pivot $(2,1)$.
Let us now do the same process, but picking $p_{21}(\lambda) = \lambda$ as the pivot in the first step.
Pick one of the polynomials with the lowest degree.
This time, we choose $p_{21}$.
$$P(\lambda) = \begin{bmatrix} \lambda^2 & \lambda - 1 \\ \color{red}{\lambda} & \lambda^2 \end{bmatrix}.$$
Swap rows/columns so that the pivot comes to the position $(1,1)$. In our case, we need to swap the first and the second row. We now have
$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda} & \lambda^2 \\ \lambda^2 & \lambda - 1 \end{bmatrix}.$$
Present all the polynomials in the first row and the first column (i.e., either $i=1$ or $j=1$) as $p_{ij}(\lambda) = p_{11}(\lambda) q_{ij}(\lambda) + r_{ij}(\lambda)$. In our case:
$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{\lambda^2} \\ \color{red}{\lambda^2} & \lambda - 1 \end{bmatrix} = \begin{bmatrix} \lambda & \color{red}{\lambda \cdot \lambda + 0} \\ \color{red}{\lambda \cdot \lambda + 0} & \lambda - 1 \end{bmatrix}.$$
So, in both cases, we got $q_{ij}(\lambda) = \lambda$ and $r_{ij}(\lambda) = 0$. This is great, as we can skip the rest of the step 3. $\ddot\smile$
We now subtract the appropriate multiples of the first row from the rows $i=2,\dots,n$ and the appropriate multiples of the first column from the columns $ji=2,\dots,n$.
Let us first find the multiple of the first row, to be subtracted from the second one, denoting it as ${\rm row}_2$. Since $p_{11}(\lambda) = \lambda$ and $p_{21}(\lambda) = \lambda^2$, we see that ${\rm row}_2$ equals the first row multiplied by $(p_{21}/p_{11})(\lambda) = \lambda$:
$${\rm row}_2 = (p_{21}/p_{11})(\lambda) \begin{bmatrix} \lambda & \lambda^2 \end{bmatrix} = \begin{bmatrix} \lambda^2 & \lambda^3 \end{bmatrix}.$$
We now subtract this from the current second row:
$$\begin{bmatrix} \lambda^2 & \lambda - 1 \end{bmatrix} - {\rm row}_2 = \begin{bmatrix} 0 & -\lambda^3 + \lambda - 1 \end{bmatrix}.$$
Our current $P(\lambda)$ is
$$P(\lambda) = \begin{bmatrix} \lambda & \lambda^2 \\ \color{red}{0} & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$
As before, we subtracting the first column's multiple will now just remove the nonzeroes in the first row.
$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{0} \\ 0 & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$
Now, this may seem great, but notice that $p_{11} \nmid p_{22}$, so we need further corrections. This is done by adding the second row to the first one, repeating the above procedure. So, we get
$$P(\lambda) = \begin{bmatrix} \color{red}{\lambda} & \color{red}{-\lambda^3 + \lambda - 1} \\ 0 & -\lambda^3 + \lambda - 1 \end{bmatrix} = \begin{bmatrix} \lambda & \color{red}{\lambda \cdot (-\lambda^2 + 1) + (-1)} \\ \color{red}{0} & -\lambda^3 + \lambda - 1 \end{bmatrix}.$$
We see that $q_{12}(\lambda) = -\lambda^2 + 1$ and $r_{12}(\lambda) = 1$. We need to subtract the first column times $q_{12}(\lambda)$ from the second colum:
$$P(\lambda) = \begin{bmatrix} \lambda & \color{red}{-1} \\ 0 & \color{red}{-\lambda^3 + \lambda - 1} \end{bmatrix}.$$
Exchanging these two, we get
$$P(\lambda) = \begin{bmatrix} -1 & \lambda \\ -\lambda^3 + \lambda - 1 & 0 \end{bmatrix}.$$
After eliminating $p_{12}$, we have
$$P(\lambda) = \begin{bmatrix} -1 & \color{red}{0} \\ -\lambda^3 + \lambda - 1 & \color{red}{-\lambda^4 + \lambda^2 - \lambda} \end{bmatrix}.$$
Then we eliminate $p_{21}$:
$$P(\lambda) = \begin{bmatrix} -1 & 0 \\ \color{red}{0} & \color{red}{-\lambda^4 + \lambda^2 - \lambda} \end{bmatrix}.$$
Repeat the process on the bottom right principal submatrix of order $n-1$. In our case, as above, $n = 2$, so the job is almost done, as we have
$$P(\lambda) = \begin{bmatrix} -1 & 0 \\ 0 & -\lambda^4 + \lambda^2 - \lambda \end{bmatrix}.$$
For this to be a Smith form, we need the nonzero diagonal polynomials to be monic (i.e., have a leading coefficient equal to $1$). In our case, we do this by multiplying $P(\lambda)$ by $-1$. In a more generale case, we'd need to multiply it from either left or right by some constant diagonal matrix.
Finally, we obtain
$$P(\lambda) = \begin{bmatrix} 1 & 0 \\ 0 & \lambda^4 - \lambda^2 + \lambda \end{bmatrix}.$$
Concluding remarks
In both cases we got monic polynomials $p_{ii}$ such that $p_{11} \mid p_{22}$, and the degree of $(p_{11} \cdot p_{22})(\lambda)$ is $4 = 2 \cdot 2 = l \cdot n$, where $n$ is the order of the matrix and $l$ is the degree of $P$, as it should be. This is due to the fact that $\det P(\lambda)$ remained unchanged (in a more general case, it might get multiplied by a constant factor, in order to get the polynomials on the diagonal to become monic).
I have discarded all the info about the transformations. If one is to keep that, the row transformations are put in $E(\lambda)$, and the column transformations are put in $F(\lambda)$, to obtain
$$E(\lambda) P_{\text{starting}}(\lambda) F(\lambda) = P_{\text{final}}(\lambda).$$
where
$$ E(\lambda)=\begin{bmatrix}-1& 1 + \lambda - \lambda^2\\ -\lambda& -1 + \lambda + \lambda^2 - \lambda^3\end{bmatrix}$$
and
$$ F(\lambda)=\begin{bmatrix}1 - \lambda& -1 + \lambda - \lambda^2 - \lambda^3 + \lambda^4\\ 1& \lambda - \lambda^3\end{bmatrix}$$