That matrix is symmetric, and the spectral theorem from linear algebra says that a symmetric matrix is orthogonally diagonalizable. That means there are two perpendicular directions along which the matrix acts as scaling, by $\lambda_1$ and by $\lambda_2$.
These $\lambda_i$ are the quadratic coefficients of the parabolic approximations to $f$ at $(x_0,y_0)$ as you move through that point in the direction of each eigenspace. Since you are already looking at a critical point, each quadratic approximation has its vertex at $(x_0,y_0)$. If the two $\lambda_i$ are opposite in sign, you have two parabolas, orthogonal to each other, opening in different directions, clearly creating a saddle. If the two $\lambda_i$ have the same sign, then depending on that sign you have either a max or a min.
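To make this concrete, here is a minimal numpy sketch with a made-up example $f(x,y)=x^2+4xy+y^2$ (my choice, not from the question), which has a critical point at the origin. It diagonalizes the Hessian with `numpy.linalg.eigh` and checks that $f$ restricted to each eigendirection is exactly the parabola $\frac12\lambda_i t^2$; the match is exact here because $f$ is quadratic, while for a general $f$ it would only hold to second order.

```python
import numpy as np

# Made-up example: f(x, y) = x^2 + 4xy + y^2, critical point at the origin.
f = lambda x, y: x**2 + 4*x*y + y**2

# Hessian at the origin (constant here, since f is quadratic):
# f_xx = 2, f_xy = f_yx = 4, f_yy = 2.
H = np.array([[2.0, 4.0],
              [4.0, 2.0]])

# eigh is the right tool for symmetric matrices: it returns real
# eigenvalues and orthonormal eigenvectors (the columns of V).
lam, V = np.linalg.eigh(H)
print("eigenvalues:", lam)          # [-2.  6.] -> opposite signs: saddle
print("V^T V =\n", V.T @ V)         # identity: the directions are perpendicular

# Along each eigendirection, f should be the parabola (lambda/2) * t^2.
t = np.linspace(-1, 1, 5)
for lam_i, v in zip(lam, V.T):
    along = f(t * v[0], t * v[1])       # f sampled along the eigenvector
    parabola = 0.5 * lam_i * t**2
    print(np.allclose(along, parabola))  # True for both directions
```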
But the determinant of a $2\times2$ matrix equals the product of its eigenvalues. So a negative determinant implies $\lambda_i$ of opposite sign, which implies a saddle point, and a positive determinant similarly implies either a max or a min; in that case both eigenvalues share the sign of $f_{xx}$ (equivalently, of the trace), which tells you which one it is.
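As a quick sketch of the resulting test (the standard second derivative test; the function name `classify_critical_point` is mine):

```python
def classify_critical_point(fxx: float, fxy: float, fyy: float) -> str:
    """Second derivative test at a critical point, from the Hessian entries.

    det(H) = fxx*fyy - fxy**2 is the product of the two eigenvalues.
    """
    det = fxx * fyy - fxy**2
    if det < 0:
        return "saddle"             # eigenvalues of opposite sign
    if det > 0:
        # both eigenvalues share the sign of fxx (and of the trace)
        return "local min" if fxx > 0 else "local max"
    return "inconclusive"           # an eigenvalue is zero

# Example: f(x, y) = x^2 + 4xy + y^2 at the origin -> fxx=2, fxy=4, fyy=2
print(classify_critical_point(2, 4, 2))   # saddle
```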
Locally at any $(x_0,y_0)$, there is a higher-dimensional version of the Taylor series, grouped here by increasing order of derivative:
$$\begin{align*}
f(x,y)&=f(x_0,y_0)+\Big[f_x(x_0,y_0)\cdot(x-x_0)+f_y(x_0,y_0)\cdot(y-y_0)\Big]\\
&\phantom{{}={}}+\frac12\Big[f_{xx}(x_0,y_0)\cdot(x-x_0)^2+f_{xy}(x_0,y_0)\cdot(x-x_0)(y-y_0)\\
&\phantom{{}={}}+f_{yx}(x_0,y_0)\cdot(y-y_0)(x-x_0)+f_{yy}(x_0,y_0)\cdot(y-y_0)^2\Big]+\cdots\\
&=f(x_0,y_0)+\nabla f(x_0,y_0)\cdot\left((x,y)-(x_0,y_0)\right)^T\\
&\phantom{{}={}}+\frac12\left((x,y)-(x_0,y_0)\right)\cdot H(x_0,y_0)\cdot\left((x,y)-(x_0,y_0)\right)^T+\cdots
\end{align*}$$
When you are at a critical point, this simplifies to
$$\begin{align*}
f(x,y)&=f(x_0,y_0)+\frac12\left((x,y)-(x_0,y_0)\right)\cdot H(x_0,y_0)\cdot\left((x,y)-(x_0,y_0)\right)^T+\cdots
\end{align*}$$
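To see that the quadratic term really does dominate near a critical point, here is a small numerical sketch with another made-up function, $f(x,y)=x^3-3x+y^2$, which has a critical point at $(1,0)$ with Hessian $\operatorname{diag}(6,2)$ there; the error of the quadratic model shrinks faster than $\|\Delta\|^2$ as you zoom in.

```python
import numpy as np

# Made-up example: f(x, y) = x^3 - 3x + y^2 has a critical point at (1, 0)
# (f_x = 3x^2 - 3 = 0, f_y = 2y = 0), with Hessian diag(6, 2) there.
f = lambda x, y: x**3 - 3*x + y**2
x0, y0 = 1.0, 0.0
H = np.array([[6.0, 0.0],
              [0.0, 2.0]])

rng = np.random.default_rng(0)
d = rng.standard_normal(2)
d /= np.linalg.norm(d)              # a random unit direction

for r in (1e-1, 1e-2, 1e-3):
    delta = r * d
    quad = f(x0, y0) + 0.5 * delta @ H @ delta
    err = abs(f(x0 + delta[0], y0 + delta[1]) - quad)
    # err / r^2 -> 0, so near the critical point f is governed
    # by the quadratic (Hessian) term alone.
    print(f"r = {r:.0e}   error / r^2 = {err / r**2:.2e}")
```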
And if we could change coordinates to variables $s$ and $t$ that run in the directions of $H$'s eigenspaces, based at the critical point, we'd have simply
$$f(s,t)=f(0,0)+\frac12\lambda_1s^2+\frac12\lambda_2t^2+\cdots$$ which I hope helps to see the parabolas and the role of the eigenvalues of $H$.
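For a concrete instance (again a made-up example): take $f(x,y)=2x^2+2xy+2y^2$, with critical point $(0,0)$ and Hessian $H=\begin{pmatrix}4&2\\2&4\end{pmatrix}$ there, whose eigenvalues are $\lambda_1=6$ and $\lambda_2=2$ along the perpendicular directions $\frac{1}{\sqrt2}(1,1)$ and $\frac{1}{\sqrt2}(1,-1)$. Substituting $(x,y)=\frac{s}{\sqrt2}(1,1)+\frac{t}{\sqrt2}(1,-1)$ gives
$$f(s,t)=3s^2+t^2=\frac12\cdot6\,s^2+\frac12\cdot2\,t^2,$$
two upward-opening parabolas: a local minimum, in agreement with $\det H=12>0$ and $f_{xx}=4>0$.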
The reason regular college textbooks don't justify why this works is that the proof is too advanced. You can't expect twenty-year-old college students, taking their third course in calculus, to truly understand it. You need to dig deeper and check out real analysis textbooks: find one, go to the multivariable part, and you'll find a proof.
A hint for finding an intuitive answer about why this works: on YouTube there's a Khan Academy multivariable calculus playlist taught by Grant Sanderson, the man behind the 3Blue1Brown channel. Search for the video about the second derivative test, and you will probably have your answer. Grant likes to give intuitive explanations of why things are the way they are instead of just hammering the proof on your soul.