I haven’t done this in quite some time, so this solution is probably unnecessarily complicated:
We identify $\mathbb{R}^{2 \times 2}$ with $\mathbb{R}^4$ via
$$
\mathbb{R}^{2 \times 2} \to \mathbb{R}^4, \,
\begin{pmatrix}
x & y \\
z & t
\end{pmatrix}
\mapsto
(x,y,z,t)^T.
$$
(So the “default basis” you used corresponds to the standard basis $(e_1, e_2, e_3, e_4)$ of $\mathbb{R}^4$.) If we understand $L$ as a linear map $\hat{L} \colon \mathbb{R}^4 \to \mathbb{R}^4$ then $\hat{L}$ is (with respect to the standard basis on both sides) given by the matrix
$$
A =
\begin{pmatrix}
1 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1
\end{pmatrix}.
$$
Also notice that the inner product on $\mathbb{R}^{2 \times 2}$ corresponds to the standard scalar product on $\mathbb{R}^4$ because
$$
\left\langle
\begin{pmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{pmatrix},
\begin{pmatrix}
b_{11} & b_{12} \\
b_{21} & b_{22}
\end{pmatrix}
\right\rangle
= a_{11} b_{11} + a_{12} b_{12} + a_{21} b_{21} + a_{22} b_{22}.
$$
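This is just the Frobenius inner product; equivalently, it can be written as a trace, $\langle M, N \rangle = \operatorname{tr}(M^T N)$.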
(This also justifies calling it the default inner product.) So finding an orthonormal basis of $\mathbb{R}^{2 \times 2}$ with respect to which $L$ is diagonal is the same as finding an orthonormal basis of $\mathbb{R}^4$ with respect to which $\hat{L}$ is represented by a diagonal matrix.
There are now different ways to solve this problem. We will first calculate the eigenspaces of $\hat{L}$; because $A$ is symmetric we know that $\hat{L}$ is diagonalizable. Then we will use the following fact:
Proposition: Let $S \in \mathbb{R}^{n \times n}$ be symmetric and let $x, y \in \mathbb{R}^n$ be eigenvectors of $S$ for eigenvalues $\lambda \neq \mu$. Then $x$ and $y$ are orthogonal.
Proof: Notice that
\begin{align*}
\lambda \langle x,y \rangle
&= \langle \lambda x, y \rangle
= \langle Sx, y \rangle
= (Sx)^T y
= x^T S^T y
= x^T S y \\
&= \langle x, S y \rangle
= \langle x, \mu y \rangle
= \mu \langle x, y \rangle.
\end{align*}
Because $\lambda \neq \mu$ it follows that $\langle x,y \rangle = 0$.
So the eigenspaces of different eigenvalues are orthogonal to each other. Therefore we can compute an orthonormal basis for each eigenspace and then put them together to obtain an orthonormal basis of $\mathbb{R}^4$; each basis vector will in particular be an eigenvector of $\hat{L}$.
By some lengthy calculation it can be shown that the characteristic polynomial of $A$ is given by
$$
\chi_A(t) = t^4 - 4 t^3 + 2 t^2 + 4t - 3.
$$
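If one does not want to trust the hand computation, the coefficients can be double-checked numerically. Here is a minimal sketch using numpy (my choice of tool, not part of the original argument):

```python
import numpy as np

# Matrix of L-hat with respect to the standard basis of R^4.
A = np.array([
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
])

# np.poly returns the coefficients of the characteristic polynomial,
# highest degree first; expect [1, -4, 2, 4, -3],
# matching t^4 - 4t^3 + 2t^2 + 4t - 3.
print(np.poly(A))
```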
It is easy to guess the roots $1$ and $-1$, so we can factor $\chi_A$ and get
$$
\chi_A(t) = (t-1)^2 (t+1) (t-3).
$$
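To make the factorization step explicit: dividing $\chi_A(t)$ by $(t-1)(t+1) = t^2 - 1$ leaves the quotient $t^2 - 4t + 3 = (t-1)(t-3)$, which gives exactly the factorization above.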
The eigenspaces can now be calculated as usual, and we find that
$$
E_1 = \langle (0,-1,0,1)^T, (-1,0,1,0)^T \rangle, \;
E_{-1} = \langle (-1,1,-1,1)^T \rangle, \;
E_3 = \langle (1,1,1,1)^T \rangle,
$$
where $E_\lambda$ denotes the eigenspace for the eigenvalue $\lambda$.
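As a quick sanity check: every row of $A$ sums to $3$, so $A (1,1,1,1)^T = 3 \cdot (1,1,1,1)^T$, confirming the eigenvalue $3$; the other eigenvectors can be checked just as directly, e.g. $A (0,-1,0,1)^T = (0,-1,0,1)^T$.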
Next we need to find an orthonormal basis for each eigenspace. We can always do this by picking some basis and then applying Gram–Schmidt. But here we are pretty lucky:
We know the basis $((0,-1,0,1)^T, (-1,0,1,0)^T)$ of $E_1$. Because both basis vectors are already orthogonal to each other we only need to normalize them. So we get $b_1 = \frac{1}{\sqrt{2}}(0,-1,0,1)^T$ and $b_2 = \frac{1}{\sqrt{2}}(-1,0,1,0)^T$.
In the case of $E_{-1}$ and $E_3$ we are even luckier, as they are both one-dimensional. So here too we only need to normalize and thus get $b_3 = \frac{1}{2} (-1,1,-1,1)^T$ and $b_4 = \frac{1}{2}(1,1,1,1)^T$.
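The normalizing factors come from the norms $\|(0,-1,0,1)^T\| = \|(-1,0,1,0)^T\| = \sqrt{2}$ and $\|(-1,1,-1,1)^T\| = \|(1,1,1,1)^T\| = \sqrt{4} = 2$.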
Putting these together we have now found a basis $(b_1, b_2, b_3, b_4)$ of $\mathbb{R}^4$ given by
$$
b_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ -1 \\ 0 \\ 1 \end{pmatrix}, \;
b_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \;
b_3 = \frac{1}{2} \begin{pmatrix} -1 \\ 1 \\ -1 \\ 1 \end{pmatrix}, \;
b_4 = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix},
$$
which is orthonormal and consists of eigenvectors of $\hat{L}$. The corresponding $2 \times 2$ matrices are
\begin{align*}
B_1 &= \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & -1 \\ 0 & 1 \end{pmatrix}, &
B_2 &= \frac{1}{\sqrt{2}} \begin{pmatrix} -1 & 0 \\ 1 & 0 \end{pmatrix}, \\
B_3 &= \frac{1}{2} \begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix}, &
B_4 &= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.
\end{align*}
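To double-check the end result numerically, here is a small numpy sketch (again only a sanity check under the identification above, not part of the argument) verifying that $(b_1, b_2, b_3, b_4)$ is orthonormal and diagonalizes $\hat{L}$:

```python
import numpy as np

# Matrix of L-hat with respect to the standard basis of R^4.
A = np.array([
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
], dtype=float)

# The columns of B are the basis vectors b_1, ..., b_4.
s = 1 / np.sqrt(2)
B = np.column_stack([
    s * np.array([0, -1, 0, 1]),
    s * np.array([-1, 0, 1, 0]),
    0.5 * np.array([-1, 1, -1, 1]),
    0.5 * np.array([1, 1, 1, 1]),
])

# Orthonormality: B^T B should be the identity matrix.
assert np.allclose(B.T @ B, np.eye(4))

# Diagonalization: B^T A B should be diag(1, 1, -1, 3).
print(np.round(B.T @ A @ B, decimals=10))
```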
Let's set $\mathcal{E} = (e_1, \dots, e_n)$ and $\mathcal{F} = (f_1, \dots, f_n)$. Also, let's denote the coordinates of a vector $v$ with respect to a basis $\mathcal{E}$ (represented as a column vector) by $[v]_{\mathcal{E}}$. Then
$$ v = \begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix}, \, [v]_{\mathcal{E}} = \begin{bmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{bmatrix} = \begin{bmatrix} 5 \\ 4 \\ 2 \end{bmatrix}, \, [v]_{\mathcal{F}} = \begin{bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{bmatrix} = \begin{bmatrix} 5 \\ -4 \\ 1 \end{bmatrix}. $$
First of all, it seems you have the roles of $S$ and $P^{-1}$ in your question reversed, as you actually have
$$ P^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}, \,\,\, S = \left( P^{-1} \right)^T = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}.$$
The multiplications you perform later are also incorrect. However, your actual conclusion is true: in order to compute $[v]_{\mathcal{F}}$ from $[v]_{\mathcal{E}}$, you need to multiply $[v]_{\mathcal{E}}$ by $P^{-1}$, not by $\left( P^{-1} \right)^T$. This is the meaning of equation $(12)$ Shilov writes on page 122 (where $Q = P^{-1}$).
However, Shilov is also not entirely wrong in claiming that the matrix "describing the transformation from the components $\xi_1,\dots,\xi_n$ to the components $\eta_1,\dots,\eta_n$ is $\left( P^{-1} \right)^T$". Why? If you write $Q [v]_{\mathcal{E}} = [v]_{\mathcal{F}}$ (equation $(12)$) out explicitly, you get
$$
\begin{aligned}
\eta_1 &= q_1^{(1)} \xi_1 + \dots + q_1^{(n)} \xi_n, \\
&\;\;\vdots \\
\eta_n &= q_n^{(1)} \xi_1 + \dots + q_n^{(n)} \xi_n.
\end{aligned}
$$
If you think of the components $\xi_1, \dots, \xi_n$ and $\eta_1, \dots, \eta_n$ as two "bases" and compare this equation to the equation you wrote in the beginning of the question, you will see that the "matrix of the transformation from the basis $\{ \xi \}$ to the basis $\{ \eta \}$" is actually $Q^T$ and not $Q$. To make this statement precise, one needs to discuss dual spaces and then the "components" of a vector with respect to a basis become the dual basis to the original basis and the matrix $\left( P^{-1} \right)^T$ becomes the change of basis matrix between the dual bases.
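To spell this out (under the usual convention, which I am assuming here, that the $j$-th column of $P$ contains the $\mathcal{E}$-coordinates of $f_j$): writing $(e^1, \dots, e^n)$ and $(f^1, \dots, f^n)$ for the dual bases of $\mathcal{E}$ and $\mathcal{F}$, one has
$$
f_j = \sum_{i=1}^n p_{ij} e_i
\quad \Longrightarrow \quad
f^j = \sum_{i=1}^n \left( \left( P^{-1} \right)^T \right)_{ij} e^i,
$$
and the components are recovered as $\xi_i = e^i(v)$ and $\eta_i = f^i(v)$. So $\left( P^{-1} \right)^T$ is literally the change of basis matrix between the dual bases.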
The coordinate vectors may be orthonormal even though the basis vectors themselves are not.
Let $v_1 = (1,3)$ and $v_2 = (3,2)$. Then the coordinates of $v_1$ with respect to the basis $\{ v_1, v_2 \}$ are $(1,0)$, and the coordinates of $v_2$ with respect to the same basis are $(0,1)$.
The coordinate vectors are orthonormal, but the basis vectors are not.
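Concretely: $\langle v_1, v_2 \rangle = 1 \cdot 3 + 3 \cdot 2 = 9 \neq 0$ and $\|v_1\| = \sqrt{1^2 + 3^2} = \sqrt{10} \neq 1$, so $\{ v_1, v_2 \}$ is neither orthogonal nor normalized, while $(1,0)$ and $(0,1)$ obviously form an orthonormal pair with respect to the standard inner product.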