But here is what I don't understand. He says $P^TAP$ represents the linear transformation $T: x \mapsto Ax$. What does he mean here?
The author actually states that "$P^TAP$ represents the linear transformation $T: x \mapsto Ax$ in the basis $B$".
So it is a matter of understanding what "in the basis $B$" means.
What does "in the basis $B$" mean?
The standard basis (of $\mathbb{R}^n$) is the set $e := \{e_1, e_2, \dotsc, e_n\}$,
where $e_i$ is the vector whose $i$-th element is $1$, and all others are $0$, that is,
$e_1 := (1, 0, 0, \dotsc, 0)^T,
e_2 := (0, 1, 0, \dotsc, 0)^T, \dotsc,
e_n := (0, 0, 0, \dotsc, 1)^T$.
We usually identify a given vector
$x := (x_1, x_2, \dotsc, x_n)^T$
with the point
$x_e = x_1 e_1 + x_2 e_2 + \dotsb + x_n e_n$
of $\mathbb{R}^n$.
What we are actually doing is regarding $x$ as a vector in the standard basis.
Now what if we were considering $x$ in an arbitrary orthonormal basis of $\mathbb{R}^n$, say,
$B := \{v_1, v_2, \dotsc, v_n\}$?
In this case, $x$ would correspond to the point
$x_B = x_1 v_1 + x_2 v_2 + \dotsb + x_n v_n$
of $\mathbb{R}^n$. That is, each coordinate $x_i$ of $x$ tells us how many times
the corresponding component $v_i$ of the basis $B$ should be accounted for.
We could regard "vector $x$ in the basis $B$" as a shorthand for that;
and we should assume that, when we don't specify a basis,
we are working in the standard basis.
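As a quick sketch in Python with NumPy (the particular basis vectors below are our own example, not from the text), the same coordinate vector names two different points of $\mathbb{R}^2$ depending on which basis we read it in:

```python
import numpy as np

# A hypothetical orthonormal basis B of R^2 (the names v1, v2 are ours):
v1 = np.array([1.0, 1.0]) / np.sqrt(2)
v2 = np.array([-1.0, 1.0]) / np.sqrt(2)

x = np.array([3.0, 2.0])                        # the coordinates (x_1, x_2)
x_e = x[0] * np.array([1.0, 0.0]) + x[1] * np.array([0.0, 1.0])
x_B = x[0] * v1 + x[1] * v2                     # same coordinates, read in B

# Same coordinate vector, two different points of R^2:
assert np.allclose(x_e, [3.0, 2.0])
assert np.allclose(x_B, [1 / np.sqrt(2), 5 / np.sqrt(2)])
```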
What the author is saying is that the matrix $A' = P^TAP$ in the basis $B$
is equivalent to the matrix $A$ in the standard basis. If $v$ is a vector
in the standard basis, let $B(v)$ denote its equivalent in the basis $B$.
The claim is then that $B(Av) = A'B(v)$. To see why this is true,
we have to see how to perform changes of basis, which I will refer to as translating.
Translating vectors between bases
If $P$ is the matrix whose columns are the vectors in $B$,
then $Px = x_1 v_1 + x_2 v_2 + \dotsb + x_n v_n$ (check this).
Note that this means that $x$ in the basis $B$ is $Px$ in the basis $e$.
So the matrix $P$ translates a vector in the basis $B$ to the standard basis.
But what if we want to translate a vector $v$ from the standard basis to the basis $B$?
This is the same as asking how to express $v$ as a linear combination of vectors in $B$,
that is, finding a vector $v' := (a_1, a_2, \dotsc, a_n)^T$ such that
$v'_B = a_1 v_1 + a_2 v_2 + \dotsb + a_n v_n = v_e$.
Since $B$ is an orthonormal basis of $\mathbb{R}^n$,
it is simply a matter of considering the orthogonal projections of $v$ on each vector of $B$,
that is, we can take $a_i = v_i^T v$.
If we ponder for a while, we will see that $v' = P^Tv$ (check this too). In other words,
we can use the matrix $P^T$ to translate a vector in the standard basis to a vector in the basis $B$.
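Both directions of the translation can be checked numerically. A minimal sketch, using a rotation matrix as an arbitrary orthonormal basis (our own choice of example):

```python
import numpy as np

theta = 0.3
# The columns of P form an orthonormal basis B of R^2 (here: a rotation).
P = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v_B = np.array([1.0, 2.0])   # a vector expressed in the basis B
v_e = P @ v_B                # the same vector in the standard basis

# Going back: the projections a_i = v_i^T v are exactly the rows of P^T.
assert np.allclose(P.T @ v_e, v_B)
assert np.allclose(P.T @ P, np.eye(2))   # orthonormal columns: P^T P = I
```

The last assertion is why the two translations are inverse to each other: for an orthonormal basis, $P^TP = I$, so $P^{-1} = P^T$.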
Translating transformations to other bases
Now $A$ is a transformation that receives vectors in the standard basis and gives results expressed in the standard basis as well.
If we want to find an equivalent operator that works in terms of a basis $B$,
we can do the following:
- Translate the given vector $v$ in the basis $B$ to the standard basis, that is, get the vector $Pv$;
- Perform the transformation $A$ on this translation, that is, compute $A(Pv)$;
- Translate the result back to basis $B$, that is, do $P^T(A(Pv))$.
From this, we can see that the matrix $P^TAP$ is the transformation $A$ working under the basis $B$.
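The three steps above can be verified directly. A sketch, with a random matrix $A$ and a random orthonormal basis obtained from a QR factorization (both are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# Any orthonormal basis works; take the Q factor of a QR factorization.
P, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A_prime = P.T @ A @ P        # the transformation A expressed in the basis B

v_B = rng.standard_normal(3)            # a vector given in the basis B
lhs = P.T @ (A @ (P @ v_B))             # translate, apply A, translate back
assert np.allclose(lhs, A_prime @ v_B)  # same as applying A' directly
```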
The columns of a transformation
As for
"Also, what he says after that doesn't really make sense to me, i.e. the first column of $P^TAP$ is the coordinate vector of $T(v_1)$ with respect to $B$"
in the proof, the author is simply stating that the first column of $P^TAP$ is the vector $T(v_1)$ expressed in terms of the basis $B$.
"The translation of an eigenvector of a transformation" is
"an eigenvector of the translation of the transformation".
We are only renaming things, so the transform still does the same thing to
$\mathbb{R}^n$, the only difference is how we are addressing the given and
resulting points.
Since the eigenvector $v_1$ in the standard basis is the vector $e_1$ in the basis $B$,
this means that $e_1$ is an eigenvector of the translated transformation $A' = P^TAP$.
Since both $v_1$ and $e_1$ are unit vectors, they are both scaled by the same factor $\lambda_1$. So $A'e_1 = \lambda_1 e_1 = (\lambda_1, 0, \dotsc, 0)^T$.
Note that for an arbitrary transformation $M$,
$Me_i$ is simply the $i$-th column of $M$.
We can also see this as what the transformation does to each axis of our basis:
it takes the vector $e_i$ to the $i$-th column vector of $M$.
In the basis $B$, the transformation $A'$ takes $e_1$ to
$\lambda_1 e_1$, so the first column of $A'$ is simply $\lambda_1 e_1$.
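To make the column argument concrete, a small sketch with a symmetric matrix of our own choosing, taking $P$ to be the matrix of orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # a symmetric example matrix
lam, P = np.linalg.eigh(A)             # orthonormal eigenvector columns v_i

A_prime = P.T @ A @ P                  # A translated into the eigenbasis B

# The first column of A' is lambda_1 * e_1; in fact every column i is
# lambda_i * e_i, so A' is diagonal.
assert np.allclose(A_prime[:, 0], [lam[0], 0.0])
assert np.allclose(A_prime, np.diag(lam))
```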
Best Answer
The identity matrix is symmetric, and it is diagonalizable by any invertible matrix $P$ because $P^{-1}IP=I$. So such a diagonalization is not necessarily unique.
If $A$ is symmetric, then it has an orthonormal basis $\{ d_1,d_2,\cdots,d_n \}$ of column eigenvectors with corresponding eigenvalues $\{ \lambda_1,\lambda_2,\cdots,\lambda_n \}$. In matrix notation
$$ A\left[\begin{array}{cccc} | & | & & | \\ d_1 & d_2 & \cdots & d_n \\ | & | & & | \end{array}\right] = \left[\begin{array}{cccc} | & | & & | \\ \lambda_1 d_1 & \lambda_2 d_2 & \cdots & \lambda_n d_n \\ | & | & & | \end{array}\right] = \left[\begin{array}{cccc} | & | & & | \\ d_1 & d_2 & \cdots & d_n \\ | & | & & | \end{array}\right] \left[\begin{array}{cccc} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{array}\right] $$ So $AU=UD$ or $A=UDU^{-1}$, where $D$ is diagonal and $U$ is the matrix whose columns are the $d_i$. The matrix $U$ is orthogonal because its columns form an orthonormal basis, thereby forcing $U^{T}U=I$.
Conversely, if $A=UDU^{-1}$ where $D$ is diagonal and $U$ is an orthogonal matrix, then every column of $U$ is an eigenvector of $A$ because $AU=UD$.
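Both directions can be checked numerically. A sketch, building a symmetric matrix from a random one (the construction $M + M^T$ is our own example):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M + M.T                            # symmetric by construction

lam, U = np.linalg.eigh(A)             # eigendecomposition for symmetric A
D = np.diag(lam)

assert np.allclose(U.T @ U, np.eye(4))  # U is orthogonal
assert np.allclose(A @ U, U @ D)        # AU = UD: columns are eigenvectors
assert np.allclose(A, U @ D @ U.T)      # A = U D U^{-1} = U D U^T
```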