Perhaps you should study some more advanced analysis, since that is where Fréchet derivatives come up. A good (and legally free) reference is Applied Analysis by John Hunter and Bruno Nachtergaele.
After that, perhaps Analysis by Elliott H. Lieb and Michael Loss? It's more advanced, so be sure you understand Hunter and Nachtergaele first.
For more intensive partial differential equations, UC Davis's upper-division PDE courses, as taught by Nordgren, are available online too (with homework and solutions). There is Math 118A: Partial Differential Equations, and 118B.
There are probably more advanced (free) references out there, but these are the ones I use...
Addendum: For references specifically on functional analysis, perhaps you should get comfortable with Eidelman et al.'s Functional Analysis: An Introduction (Graduate Studies in Mathematics 66; American Mathematical Society, Providence, RI, 2004). I've heard good things about J.B. Conway's Functional Analysis, although I have yet to read it...
At your link to MathWorld, "linearization" is to be understood as the Fréchet derivative of the appropriate nonlinear mapping, a usage that has become quite common nowadays. Your second-order PDE is quasilinear, i.e., linear w.r.t. the highest-order derivatives. A strict formal definition of a quasilinear PDE is usually omitted, mostly because its rather awkward notation can hardly be avoided. Indeed, in your case, the strict formal definition could look like this. First, we regard $a,b,c,d,e\,$ as functions of $x,y,u$, with $u$ being some unknown function of the independent variables $x,y$. Second, we introduce a function $F=F(x,y,p,q_1,q_2,r_{11},r_{12},r_{22})\,$ of eight independent variables that is a linear polynomial in its last three variables $\,r_{11},r_{12},r_{22}\,$, which are reserved for the highest-order derivatives. Namely, in your case, it is
$$
F=a(x,y,p)r_{11}+b(x,y,p)r_{12}+c(x,y,p)r_{22}+d(x,y,p)(q_{1})^2+e(x,y,p)q_{2}\,.
$$
Third and finally, we declare your equation to be of the form
$$
F\bigl(x,y,u(x,y),u_x(x,y),u_y(x,y),u_{xx}(x,y),u_{xy}(x,y),u_{yy}(x,y)\bigr)=0\quad \forall\, x,y.
$$
At any point $(x,y)$, classification of your equation is determined by the value of the expression
$$
D(x,y)\overset{\rm def}{=}\Bigl(b\bigl(x,y,u(x,y)\bigr)\Bigr)^2
-4a\bigl(x,y,u(x,y)\bigr)\!\cdot\! c\bigl(x,y,u(x,y)\bigr),
$$
with the equation being called elliptic whenever $D(x,y)<0$, hyperbolic whenever $D(x,y)>0$, and parabolic whenever $D(x,y)=0$. There is nothing more to it, though one can't help remarking that identifying the parabolic type with just $D(x,y)=0$ is by now a thorough anachronism, still tolerated in PDE theory as something of a historical landmark.
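As a hedged illustration (this example is not from your question; the equation is chosen only for concreteness), take $a=1$, $b=0$, $c(x,y,p)=p$ and $d=e=0$, so that the equation reads
$$
u_{xx}+u\,u_{yy}=0 .
$$
Then
$$
D(x,y)=0^2-4\cdot 1\cdot u(x,y)=-4\,u(x,y),
$$
so the equation is elliptic where $u(x,y)>0$, hyperbolic where $u(x,y)<0$, and parabolic where $u(x,y)=0$. The type of a quasilinear equation can thus depend on the particular solution under consideration, not only on the point $(x,y)$.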
Fully nonlinear PDE. When a nonlinear PDE is not quasilinear,
classification is based on the linear part of the nonlinear mapping,
i.e., on its Fréchet derivative, which governs questions of
local solvability for the nonlinear mapping. Just to illustrate how it works,
consider a simple example of a second-order nonlinear partial
differential operator, say,
$$
L(u)\overset{\rm def}{=}F(u_{xx},u_{xy},u_{yy}),
\quad F\in C^1(\mathbb{R}^3),
$$
defined on twice-differentiable functions $u=u(x,y)$ with some suitable choice of
function spaces. The Fréchet derivative of the nonlinear mapping $L$ at
the solution $u$ is a linear partial differential operator with variable
coefficients
$$
L_u(v)\overset{\rm def}{=}F_p(u_{xx},u_{xy},u_{yy})v_{xx}+
F_q(u_{xx},u_{xy},u_{yy})v_{xy}+F_r(u_{xx},u_{xy},u_{yy})v_{yy}
$$
where $F_p\,,F_q\,,F_r$ denote the partial derivatives
$$
F_p={\partial_p}F(p,q,r),\quad F_q={\partial_q}F(p,q,r),
\quad F_r={\partial_r}F(p,q,r).
$$
At any point $(x,y)$, classification of the linear partial differential
operator $L_u(v)$ is of course determined by the value of the expression
$$
D_u(x,y)\overset{\rm def}{=}\bigl(F_q(u_{xx},u_{xy},u_{yy})\bigr)^2
-4F_p(u_{xx},u_{xy},u_{yy})\!\cdot\!F_r(u_{xx},u_{xy},u_{yy}).
$$
Hence, the nonlinear partial differential operator $L(u)$ at the solution $u$
is called elliptic at a point $(x,y)$ whenever $D_u(x,y)<0$, and likewise for the hyperbolic ($D_u(x,y)>0$) and parabolic ($D_u(x,y)=0$) cases.
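For a concrete sketch, assume (purely as an illustration, not something taken from the question) the Monge-Ampère-type choice $F(p,q,r)=pr-q^2$, so that $L(u)=u_{xx}u_{yy}-u_{xy}^2$. Then $F_p=r$, $F_q=-2q$, $F_r=p$, and
$$
L_u(v)=u_{yy}\,v_{xx}-2u_{xy}\,v_{xy}+u_{xx}\,v_{yy},
\qquad
D_u(x,y)=4\bigl(u_{xy}^2-u_{xx}u_{yy}\bigr).
$$
Hence this $L$ is elliptic at $u$ exactly where $u_{xx}u_{yy}-u_{xy}^2>0$ and hyperbolic where this quantity is negative. For instance, $u=\tfrac12(x^2+y^2)$ gives $D_u=-4<0$ (elliptic everywhere), while $u=xy$ gives $D_u=4>0$ (hyperbolic everywhere).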
Best Answer
To every second-order elliptic PDE $$Lu = f$$ on $\Omega \subset \mathbb{R}^n$, where $L$ is an elliptic operator and $f$ a measurable function, is associated a positive-definite, bounded, symmetric coefficient matrix $$Q(x) = [q_{ij}(x)], \quad i,j = 1,\ldots, n.$$ That is, the quadratic form $$\mathcal{Q}(x,\xi) = \xi^\top Q(x) \xi$$ satisfies, for some $c,C > 0$, $$c|\xi|^2 \leq \mathcal{Q}(x,\xi) \leq C|\xi|^2$$ for almost every $x \in \Omega$ and every $\xi \in \mathbb{R}^n$. It has been shown by various authors (but most fundamentally De Giorgi, Nash and Moser) that, in the presence of local Sobolev and Poincaré inequalities and an accumulating sequence of Lipschitz cutoff functions, weak solutions to $Lu = f$ exist and are Hölder continuous. As is well known, weak solutions live in Sobolev spaces, hence the importance of these spaces to the subject.
Now, suppose we relax the condition on the quadratic form. Call $Lu = f$ degenerate elliptic if we have only that $$0 \leq \mathcal{Q}(x,\xi) \leq C|\xi|^2.$$ In particular, we say that $Q$ degenerates at $x \in \Omega$ if there exists $\xi \neq 0$ such that $\mathcal{Q}(x,\xi) = 0$. As it turns out, allowing the quadratic form to vanish causes major difficulties in adapting the theory of weak solutions. Weak solutions still exist, but they do NOT live, any longer, in Sobolev spaces. Instead, as has been proved by Sawyer and Wheeden (2009) and Rodney (2012), they reside in degenerate Sobolev spaces, which are quite a bit more difficult to deal with.
These spaces are defined with reference to the particular matrix $Q$ with which we are working. So given such a matrix $Q$, define the (possibly infinite) norm $$||w||_{QH^{1,p}(\Omega)} = \left( ||w||_p^p + \int_\Omega |\nabla w^\top Q(x) \nabla w|^\frac{p}{2} dx\right)^\frac{1}{p}$$ on $Lip_{loc}(\Omega)$, the space of locally Lipschitz functions. Note that the gradients of such functions exist almost everywhere by the Rademacher-Stepanov theorem. We define the degenerate Sobolev space $QH^{1,p}(\Omega)$ as the completion of the set $$\{w \in Lip_{loc}(\Omega) : ||w||_{QH^{1,p}(\Omega)} < \infty\}$$ with respect to this norm, analogously to the (secondary) definition of classical Sobolev spaces, but with the gradient norm weighted by the matrix $Q$.
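To make the definition concrete, here is a small example under assumptions of my own (the matrix below is chosen for illustration and is not taken from the papers cited above). Let $\Omega \subset \mathbb{R}^2$ be bounded, write $x = (x_1,x_2)$, and take $$Q(x) = \begin{pmatrix} 1 & 0 \\ 0 & x_1^2 \end{pmatrix}, \qquad \mathcal{Q}(x,\xi) = \xi_1^2 + x_1^2\,\xi_2^2.$$ Then $0 \leq \mathcal{Q}(x,\xi) \leq C|\xi|^2$ with $C = \max(1, \sup_\Omega x_1^2)$, and $Q$ degenerates exactly at the points of $\Omega$ on the line $x_1 = 0$ (take $\xi = (0,1)$). For $p = 2$ the norm becomes $$||w||_{QH^{1,2}(\Omega)} = \left( \int_\Omega |w|^2\,dx + \int_\Omega \left( (\partial_{x_1} w)^2 + x_1^2\,(\partial_{x_2} w)^2 \right) dx \right)^\frac{1}{2},$$ so the derivative in the $x_2$ direction is controlled only away from $x_1 = 0$. This $Q$ is the coefficient matrix of the Grushin-type operator $\partial_{x_1}^2 + x_1^2\,\partial_{x_2}^2$.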
As a remark, when you read about this subject in Sawyer and Wheeden's paper, for example, you will find that they denote the spaces defined this way by $W^{1,p}_Q (\Omega)$, even though this goes against the naming convention for the classical spaces.