It would’ve been good if you had quoted the theorem verbatim from your source, because I think you might be misreading it and missing some important details of its statement. The theorem makes two separate but related claims. For a diagonalizable operator:
- Expressed in a basis of eigenvectors, the matrix of the operator is diagonal.
- Given an orthonormal basis of eigenvectors, the operator can be decomposed into a linear combination of projectors onto the spans of the individual eigenvectors. This is the content of the identity $\hat A = \sum_{k,j} A_{kj}\,\lvert\varphi_k\rangle \langle\varphi_j\rvert$, which, since $A_{kj} = \lambda_k\delta_{kj}$ in an eigenbasis, reduces to $\hat A = \sum_k \lambda_k\,\lvert\varphi_k\rangle\langle\varphi_k\rvert$.
In order to understand these claims correctly, it’s important to distinguish between a vector $\lvert v\rangle$ and its coordinate representation relative to some basis $\mathcal B$, which I’ll denote $[v]_{\mathcal B}\in\mathbb C^n$. This distinction is easy to forget when the vector itself is an element of $\mathbb C^n$. Similarly, one must distinguish between the operator $\hat A$ and its matrix representation $[\hat A]_{\mathcal B'}^{\mathcal B}$.
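As a concrete illustration of this distinction, here is a minimal numpy sketch (the vector and the alternative basis are made-up values, not from your question): the same vector has different coordinate columns relative to different bases.

```python
import numpy as np

# A vector v in C^2 (illustrative values).
v = np.array([3.0, 1.0])

# Relative to the standard basis E, the coordinates ARE the entries of v.
v_E = v

# Another basis B, stored as the columns of a matrix.
B = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# Coordinates relative to B solve  B @ v_B = v.
v_B = np.linalg.solve(B, v)

print(v_E)  # [3. 1.]
print(v_B)  # [2. 1.]  -- same vector, different coordinate column
```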
Claim #2 above is a statement about an operator and its eigenvectors. It holds independently of the basis in which you choose to represent these vectors and this operator. Note that this representation basis isn’t necessarily the orthonormal eigenbasis that’s the subject of the theorem; it can be any orthonormal basis whatsoever. Claim #1, on the other hand, is about the representation of the operator $\hat A$ in a particular basis. Specifically, it says that if $\mathcal B = \{\lvert\varphi_i\rangle\}$ is an orthonormal basis that consists of eigenvectors of $\hat A$, then the matrix $[\hat A]_{\mathcal B}^{\mathcal B}$ is diagonal.
When you tried to verify the first claim by applying the decomposition in the second, you did so relative to the standard basis $\mathcal E$. That is, you computed $[\hat A]_{\mathcal E}^{\mathcal E} = \sum_i \lambda_i\,[\varphi_i]_{\mathcal E}([\varphi_i]_{\mathcal E})^T$ (note the order: the outer product is a column times a row; for complex vectors, use the conjugate transpose rather than $T$), but this is just your original matrix. Instead, you need to express everything relative to the eigenbasis $\mathcal B=\{\lvert\varphi_i\rangle\}$: $[\hat A]_{\mathcal B}^{\mathcal B} = \sum_i \lambda_i\,[\varphi_i]_{\mathcal B}([\varphi_i]_{\mathcal B})^T$.
In order to do this, you first need an orthonormal eigenbasis. The two eigenvectors that you found are orthogonal, but you do need to normalize them, as you’ve done in a later edit. We then have $$[\varphi_1]_{\mathcal B} = \begin{bmatrix}1\\0\end{bmatrix} \text{ and } [\varphi_2]_{\mathcal B} = \begin{bmatrix}0\\1\end{bmatrix},$$ so $$5\,[\varphi_1]_{\mathcal B}([\varphi_1]_{\mathcal B})^T + 1\cdot[\varphi_2]_{\mathcal B}([\varphi_2]_{\mathcal B})^T = 5\begin{bmatrix}1&0\\0&0\end{bmatrix}+\begin{bmatrix}0&0\\0&1\end{bmatrix} = \begin{bmatrix}5&0\\0&1\end{bmatrix},$$ which is diagonal as claimed.
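You can check both claims numerically with numpy. Since your original matrix isn’t reproduced above, the sketch below uses the symmetric matrix $\begin{bmatrix}3&2\\2&3\end{bmatrix}$ as a stand-in; it also has eigenvalues 5 and 1 with orthogonal eigenvectors.

```python
import numpy as np

# Stand-in for the question's matrix: symmetric, eigenvalues 5 and 1.
A = np.array([[3.0, 2.0],
              [2.0, 3.0]])

# Orthonormal eigenbasis: eigh returns normalized eigenvectors as the
# columns of P, with eigenvalues in ascending order [1., 5.].
eigvals, P = np.linalg.eigh(A)

# Claim #1: in the eigenbasis, the matrix of the operator is diagonal.
A_B = P.conj().T @ A @ P
print(np.round(A_B, 10))                 # diag(1, 5)

# Claim #2: A equals the eigenvalue-weighted sum of outer products
# |phi_i><phi_i|, here computed in the standard representation basis.
A_rebuilt = sum(lam * np.outer(P[:, i], P[:, i].conj())
                for i, lam in enumerate(eigvals))
print(np.allclose(A, A_rebuilt))         # True
```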
It’s worth noting that this diagonal decomposition is just a special case of a more general outer product decomposition of an operator. For any orthonormal basis, $\hat I = \sum_i \lvert i\rangle\langle i\rvert$, and so $$\hat A = \hat I\hat A\hat I = \sum_{i,j} \langle i \rvert \hat A \lvert j \rangle \lvert i\rangle\langle j\rvert.$$ Now, if $\lvert i\rangle$ also happens to be an eigenvector of the diagonalizable operator $\hat A$ with eigenvalue $\lambda_i$, then $\hat A\lvert i\rangle = \lambda_i\lvert i\rangle$ and $\langle i\rvert \hat A\lvert j\rangle = \lambda_i\delta_{ij}$, which leads to the diagonal decomposition $$\hat A = \hat I\hat A\hat I = \sum_{i,j} \langle i \rvert \hat A \lvert j \rangle \lvert i\rangle\langle j\rvert = \sum_{i,j} \lambda_i\delta_{ij} \lvert i\rangle\langle j\rvert = \sum_i \lambda_i \lvert i\rangle\langle i\rvert.$$ Relative to this basis, the corresponding matrix is diagonal.
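This more general decomposition is also easy to verify numerically. The sketch below (all names and values are illustrative) uses an arbitrary operator and a random orthonormal basis, obtained from a QR factorization, that has nothing to do with the operator’s eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))              # an arbitrary operator

# Any orthonormal basis will do; take the columns of Q from a QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
basis = [Q[:, i] for i in range(n)]

# Resolution of the identity: I = sum_i |i><i|.
I = sum(np.outer(b, b.conj()) for b in basis)
print(np.allclose(I, np.eye(n)))             # True

# A = sum_{i,j} <i|A|j> |i><j|.
A_rebuilt = sum((bi.conj() @ A @ bj) * np.outer(bi, bj.conj())
                for bi in basis for bj in basis)
print(np.allclose(A, A_rebuilt))             # True
```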
**Best Answer**
Matrices only define linear transformations relative to some basis; they don't describe a linear transformation on their own. Thus, implicit in $$T=\begin{pmatrix} 2&-3 \\ 3 & 2 \end{pmatrix}$$ is the statement that the vector space already has some basis and that this is the matrix for $T$ with respect to that basis. E.g., perhaps the space is $k^2$, so it already has the standard basis vectors $\begin{pmatrix} 1 \\ 0\end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1\end{pmatrix}$, and $T$ has that matrix with respect to that basis.
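To make this basis-dependence concrete, here is a small numpy sketch (the alternative basis is an arbitrary choice of mine, purely for illustration): the columns of the matrix are the images of the chosen basis vectors, and the very same transformation acquires a different matrix in a different basis.

```python
import numpy as np

T = np.array([[2.0, -3.0],
              [3.0,  2.0]])

# Columns of T are the images of the standard basis vectors.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(T @ e1, T @ e2)    # [2. 3.] [-3. 2.]  -- exactly the columns of T

# The same transformation, expressed in another basis (the columns of P),
# has matrix P^{-1} T P, which is generally not the matrix above.
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
T_in_P = np.linalg.inv(P) @ T @ P
print(T_in_P)            # [[-1. -6.] [ 3.  5.]]
```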
If $V$ is a vector space, then a linear transformation $T:V\to V$ is not a matrix, but rather a function with nice properties that respect the vector space structure. We can then describe it using bases and a matrix, but that's only a description, and the description depends on the basis used to compute the matrix.
As for how to compute the matrix of a linear transformation with respect to some basis, look here for some notes I found online, or consult any decent linear algebra textbook, like Axler's (it appears to be Section 3C there). For an example of how to do this in a particular case, you can look at this question.
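For reference, here is a minimal numpy sketch of that computation; the function name `matrix_of` and the example basis are mine, not from the linked notes. The recipe is the standard one: apply $T$ to each basis vector and record the coordinates of the result relative to that same basis; those coordinate columns form the matrix.

```python
import numpy as np

def matrix_of(T, basis):
    """Matrix of the linear map T relative to `basis` (basis vectors as columns).

    Column j holds the coordinates, relative to `basis`, of T applied to the
    j-th basis vector: solve  basis @ col_j = T(basis[:, j]).
    """
    images = np.column_stack([T(basis[:, j]) for j in range(basis.shape[1])])
    return np.linalg.solve(basis, images)

# Example: T(v) = [[2,-3],[3,2]] v, relative to the standard basis ...
T = lambda v: np.array([[2.0, -3.0], [3.0, 2.0]]) @ v
E = np.eye(2)
print(matrix_of(T, E))    # recovers [[2,-3],[3,2]]

# ... and relative to another basis, where the same map looks different
# (matching the change-of-basis computation P^{-1} T P above).
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(matrix_of(T, B))    # [[-1. -6.] [ 3.  5.]]
```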