Quadratic forms are great! They are related to some pretty great stuff, such as bilinear forms and the Arf invariant. Quadratic forms encode the so-called "quadric surfaces" such as ellipsoids, hyperbolic paraboloids, and so on. The principal axis theorem, which is the finite-dimensional real case of the spectral theorem, is one of the most important theorems in linear algebra! It is what allows us to "transform" the quadratic forms your professor mentioned.
Take a quadratic form $q: \Bbb R^n \to \Bbb R$ defined by $x \mapsto x^tAx$. Since $A$ is symmetric (or can be replaced by the symmetric matrix $\tfrac{1}{2}(A + A^t)$ without changing the form), the principal axis theorem says we may orthogonally diagonalize it! This is what eliminates any of the cross-terms such as $x_1x_2$. Writing $A = QDQ^t$ with $Q$ orthogonal and setting $y = Q^tx$, we get $x^tAx = x^tQDQ^tx = (Q^tx)^tD(Q^tx) = y^tDy$. This matrix $D$ is diagonal, and its diagonal entries are the eigenvalues of $A$.
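If you want to see this numerically, here is a minimal sketch in Python with NumPy (the symmetric matrix $A$ below is just a made-up example):

```python
import numpy as np

# A made-up symmetric matrix; as a quadratic form it has the
# cross-term 2*x1*x2:  x^t A x = 2x1^2 + 2x1x2 + 2x2^2.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh handles symmetric matrices: A = Q D Q^t with Q orthogonal.
eigenvalues, Q = np.linalg.eigh(A)
D = np.diag(eigenvalues)
assert np.allclose(A, Q @ D @ Q.T)

# In the new coordinates y = Q^t x the form has no cross-terms,
# and x^t A x = y^t D y for every x.
x = np.array([1.0, 3.0])
y = Q.T @ x
print(x @ A @ x, y @ D @ y)  # both print the same value
```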
The significance of this "transformed" quadratic form is that it is more meaningful in terms of the information it encodes. Without those pesky cross-terms, we can see exactly what the quadric surface is without the fluff. The easiest surfaces to identify are those of the form $a_1y_1^2 + a_2y_2^2 + a_3y_3^2$, since the signs of $a_1, a_2$, and $a_3$ are how we distinguish between an ellipsoid (all coefficients of the same sign), a hyperboloid of one or two sheets (mixed signs), and so on.
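For a quick made-up example, take $q(x_1, x_2, x_3) = 2x_1^2 + 2x_2^2 + 2x_3^2 + 2x_1x_2$. Its symmetric matrix
$$A = \begin{bmatrix} 2 & 1 & 0\\ 1 & 2 & 0\\ 0 & 0 & 2 \end{bmatrix}$$
has eigenvalues $3$, $1$, and $2$, so after the change of variables the form is $3y_1^2 + y_2^2 + 2y_3^2$. All three coefficients are positive, so the level set $q = 1$ is an ellipsoid.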
They are also of great use in physics when we are dealing with the inertia tensor of a rigid body: orthogonally diagonalizing it yields the body's principal axes of rotation. They are one of the coolest things we learn about in first-year linear algebra!
Edit:
Check out these notes by Professor Mike Hopkins at Harvard about quadratic and bilinear forms. Professor Hopkins gave a really good lecture at Northwestern this past May in which he discussed some of the more high-level aspects of quadratic forms and how they connect to the Arf invariant. His lecture and these notes are accessible to anyone taking a linear algebra course, and the notes in particular should spark some "aha!" moments and deeper connections and intuitions about quadratic forms.
http://math.harvard.edu/~mjh/northwestern.pdf
To add to amd's comment, given a $C^2$, real-valued function $f$ of $n$ variables and a critical point $x_0$ of the function, we can Taylor expand $f$ to second order about $x_0$ to discern the nature of the critical point (the first-order term vanishes because $\nabla f(x_0) = 0$). That is,
$$
f(x) = f(x_0) + \tfrac{1}{2}(x - x_0)^tH(x - x_0) + o(\Vert{x - x_0}\Vert^2),
$$
where $H$ is the Hessian of $f$, encoding all second-order partials of $f$ at the point $x_0 \in \Bbb R^n$. Since $f$ is $C^2$, the Hessian of $f$ is symmetric, and we may orthogonally diagonalize $H$ (this is "transforming" the quadratic form via an orthogonal change of variables):
$$
f(y) = f(y_0) + \tfrac{1}{2}(y - y_0)^tD(y - y_0) + o(\Vert{y - y_0}\Vert^2).
$$
From $D$, we can pick off right away whether $x_0$ (equivalently, $y_0$) is a local max, min, or neither, since the entries of $D$ along its diagonal are the eigenvalues of the Hessian. If $D$ has strictly positive eigenvalues, then $x_0$ is a local minimum (think concave up), and if $D$ has strictly negative eigenvalues, then $x_0$ is a local maximum (think concave down). If $D$ has both positive and negative eigenvalues, $x_0$ is a saddle point.
In short, this makes the classification of extrema simpler, thanks to the fact that the second-order term in the Taylor expansion of $f$ about a critical point is itself a quadratic form.
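To make this concrete, here is a minimal NumPy sketch of the classification, using a made-up function $f(x, y) = x^2 + 3xy + y^2$, whose only critical point is the origin:

```python
import numpy as np

# Hessian of the made-up function f(x, y) = x^2 + 3xy + y^2 at its
# critical point (0, 0); entry (i, j) holds the second partial
# d^2 f / (dx_i dx_j), so the matrix is symmetric since f is C^2.
H = np.array([[2.0, 3.0],
              [3.0, 2.0]])

# eigvalsh returns the eigenvalues of a symmetric matrix.
eigenvalues = np.linalg.eigvalsh(H)

if np.all(eigenvalues > 0):
    print("local minimum")
elif np.all(eigenvalues < 0):
    print("local maximum")
elif np.any(eigenvalues > 0) and np.any(eigenvalues < 0):
    print("saddle point")  # this example: eigenvalues -1 and 5
else:
    print("inconclusive (some eigenvalue is zero)")
```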
First note that $P = A(A^TA)^{-1}A^T$ maps the column space $R(A)$ identically to itself. Indeed, for a vector $x$ in the domain, we have
$\require{extpfeil}\Newextarrow{\xmapsto}{5,5}{0x27FC}$
$$Ax \,\xmapsto{A^T} A^TAx \,\xmapsto{(A^TA)^{-1}} x \,\xmapsto{A} Ax.$$
On the other hand, for a vector $y \in R(A)^\perp$, recall that $R(A)^\perp = N(A^T)$, so
$$y\,\xmapsto{A^T} 0 \,\xmapsto{A(A^TA)^{-1}} 0.$$
Finally, the domain can be decomposed as $R(A) \oplus R(A)^\perp$, so with respect to this decomposition we have $P = I \oplus 0$, which means precisely that $P$ is the orthogonal projection onto $R(A)$.
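If you'd like a numerical sanity check, here is a small NumPy sketch (the tall matrix $A$ is made up, chosen with full column rank so that $A^TA$ is invertible):

```python
import numpy as np

# A made-up 3x2 matrix with independent columns, so A^T A is invertible.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T

# P fixes the column space: P(Ax) = Ax.
x = np.array([3.0, -2.0])
assert np.allclose(P @ (A @ x), A @ x)

# P kills R(A)^perp = N(A^T): this y satisfies A^T y = 0.
y = np.array([2.0, -2.0, 1.0])
assert np.allclose(A.T @ y, 0)
assert np.allclose(P @ y, 0)

# Idempotent and symmetric, as an orthogonal projection must be.
assert np.allclose(P @ P, P)
assert np.allclose(P, P.T)
```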
The scalar is being subtracted from the diagonal because that scalar times the identity matrix is being subtracted from the original matrix:
$$\begin{bmatrix} 3 & 1 & 4\\ 1 & 5 & 9\\ 2 & 6 & 5 \end{bmatrix} - \lambda\begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}$$
One way to visualize this is to imagine the transformation given by the first matrix, then the transformation given by the scalar matrix $\lambda I$, and finally to subtract each vector's image under the second transformation from its image under the first.
That's why this method finds eigenvalues: an eigenvector with eigenvalue $\lambda$ is scaled by exactly the same amount $\lambda$ under both transformations, so when the two images are subtracted from one another they produce the zero vector. In other words, $(A - \lambda I)v = 0$ for a nonzero $v$, which is exactly the condition that $A - \lambda I$ is singular, i.e. $\det(A - \lambda I) = 0$.
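Here is a small NumPy sketch with the matrix above: each eigenvector lies in the null space of $A - \lambda I$, which is why $\det(A - \lambda I) = 0$ picks out the eigenvalues:

```python
import numpy as np

A = np.array([[3.0, 1.0, 4.0],
              [1.0, 5.0, 9.0],
              [2.0, 6.0, 5.0]])

# Columns of `vecs` are the eigenvectors of A.
vals, vecs = np.linalg.eig(A)

for lam, v in zip(vals, vecs.T):
    # A - lam*I is singular, so its determinant is (numerically) zero ...
    print(f"det(A - {lam:.4f} I) = {np.linalg.det(A - lam * np.eye(3)):.1e}")
    # ... and v is in its null space: Av = lam*v, so (A - lam*I)v = 0.
    assert np.allclose((A - lam * np.eye(3)) @ v, 0)
```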