Perhaps you should study some more advanced analysis, since that is where Fréchet derivatives come up. A good (and legally free) reference is Applied Analysis by John Hunter and Bruno Nachtergaele.
After that, perhaps Analysis by Elliott H. Lieb and Michael Loss? It's more advanced, so be sure you understand Hunter and Nachtergaele first.
For a more intensive treatment of partial differential equations, the materials from UC Davis's upper-division PDE courses (with homework and solutions), from when Nordgren taught them, are available online too. There is Math 118A: Partial Differential Equations, and 118B.
There are probably more advanced (free) references out there, but these are the ones I use...
Addendum: For references specifically on functional analysis, you should perhaps get comfortable with Eidelman et al.'s Functional Analysis: An Introduction (Graduate Studies in Mathematics 66; American Mathematical Society, Providence, RI, 2004). I've heard good things about J.B. Conway's Functional Analysis, although I have yet to read it...
First, you should not believe in anything in mathematics, in particular weak solutions of PDEs. They are sometimes a useful tool, as others have pointed out, but they are often not unique. For example, one needs an additional entropy condition to obtain uniqueness of weak solutions for scalar conservation laws, like Burgers' equation. Also note that there are weak solutions of the Euler equations that are compactly supported in space and time, which is absurd (a fluid that starts at rest, no force is applied, and then it does something crazy and comes back to rest). Weak solutions are a useful tool, sometimes connected to physics, but that is it.
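To make the non-uniqueness concrete, here is a standard textbook illustration (not taken from the discussion above). It assumes Burgers' equation is written in the form $u_t + \left(\tfrac{1}{2}u^2\right)_x = 0$ with initial data $u(x,0)=0$ for $x<0$ and $u(x,0)=1$ for $x>0$. Both of the following functions are weak solutions with this initial data:

$$u_1(x,t) = \begin{cases} 0, & x < t/2, \\ 1, & x > t/2, \end{cases} \qquad u_2(x,t) = \begin{cases} 0, & x < 0, \\ x/t, & 0 < x < t, \\ 1, & x > t. \end{cases}$$

The discontinuity in $u_1$ travels at the Rankine-Hugoniot speed $\frac{F(1)-F(0)}{1-0} = \frac{1}{2}$ with $F(u)=\tfrac{1}{2}u^2$, so $u_1$ satisfies the integral form of the equation, but characteristics emanate from the discontinuity rather than run into it. The entropy condition rules out $u_1$ and selects the rarefaction wave $u_2$.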
In general, it is naive to ignore applications when studying or looking for motivations for theoretical objects in PDEs. Nearly all applications of PDEs are in physical sciences, engineering, materials science, image processing, computer vision, etc. These are the motivations for studying particular types of PDEs, and without these applications, there would be almost zero mathematical interest in many of the PDEs we study. For instance, why do we spend so much time studying parabolic and elliptic equations, instead of focusing effort on bizarre fourth order equations like $u_{xxxx}^\pi = u_y^2e^{u_z}$? (hint: there are physical applications of elliptic and parabolic equations). We study an extremely small sliver of all possible PDEs, and without a mind towards applications, there is no reason to study these PDEs instead of others.
You say you do not know anything about physics; I would encourage you to learn some physics and its connections to PDEs (e.g., the heat equation or the wave equation) before learning about theoretical properties of PDEs, like weak solutions.
PDEs are only models of the physical phenomena we care about. For example, consider conserved quantities. If $u(x,t)$ denotes the density (say heat content, or density of traffic along a highway) of some quantity along a line at position $x$ and time $t$, then if the quantity is truly conserved, it satisfies (trivially) a conservation law like
$$\frac{d}{dt} \int_a^b u(x,t) \, dx = F(a,t) - F(b,t), \ \ \ \ \ (*)$$
where $F(x,t)$ denotes the flux of the density $u$, that is, the amount of heat/traffic/etc flowing to the right per unit time at position $x$ and time $t$. The equation simply says that the only way the amount of the substance in the interval $[a,b]$ can change is by the substance moving into the interval at $x=a$ or moving out at $x=b$.
The function $u$ need not be differentiable in order to satisfy the equation above. However, it is often more convenient to assume $u$ and $F$ are differentiable, set $b = a+h$, divide by $h$, and send $h\to 0$ to obtain (formally) a differential equation
$$\frac{\partial u}{\partial t} + \frac{\partial F}{\partial x} = 0. \ \ \ \ \ (+)$$
This is called a conservation law, and we can obtain a closed PDE by making some physical modeling assumption on the flux $F$. For instance, in heat flow, Fourier's law of heat conduction says $F=-k\frac{\partial u}{\partial x}$ (for diffusion, Fick's law has the identical form). For traffic flow, a common flux is $F(u)=u(1-u)$, which gives a scalar conservation law.
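For readers who want the formal step from (*) to (+) spelled out, here is a short sketch. It assumes $u$ and $F$ are continuously differentiable, so that the time derivative can be moved inside the integral. Setting $b=a+h$ in (*) and dividing by $h$ gives

$$\frac{1}{h}\int_a^{a+h} \frac{\partial u}{\partial t}(x,t) \, dx = -\frac{F(a+h,t) - F(a,t)}{h}.$$

As $h\to 0$, the left side tends to $\frac{\partial u}{\partial t}(a,t)$ and the right side tends to $-\frac{\partial F}{\partial x}(a,t)$. Since $a$ is arbitrary, this is (+). With the two fluxes mentioned above (and constant $k$), (+) becomes

$$\frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \qquad \text{and} \qquad \frac{\partial u}{\partial t} + (1-2u)\frac{\partial u}{\partial x} = 0,$$

respectively.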
Whatever physical model you choose, you have to understand that (*) is the real equation you care about, and (+) is just a convenient way to write the equation. It would seem absurd to say that if one cannot find a classical solution of (+), then we should throw up our hands and admit defeat.
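As a rough illustration of why (*) is the natural object, here is a minimal Python sketch of a finite-volume scheme for the traffic flux $F(u)=u(1-u)$. It works directly with cell averages and interface fluxes, so the discrete analogue of (*) holds even though the initial data is discontinuous. Everything beyond the flux itself is an assumption made for illustration: the Rusanov (local Lax-Friedrichs) numerical flux, the grid, the time step, the initial jump from $0.2$ to $0.9$, and the function names `rusanov`, `step`, and `max_balance_error`.

```python
def flux(u):
    """Traffic flux F(u) = u(1 - u) from the text."""
    return u * (1.0 - u)


def rusanov(ul, ur, alpha=1.0):
    """Rusanov numerical flux between a left state ul and a right state ur.

    alpha bounds |F'(u)| = |1 - 2u| for densities 0 <= u <= 1 (an assumption).
    """
    return 0.5 * (flux(ul) + flux(ur)) - 0.5 * alpha * (ur - ul)


def step(u, dx, dt):
    """Advance the cell averages by one time step.

    Returns the new averages and the fluxes through the left and right
    ends of the domain. Boundary faces reuse the adjacent cell value.
    """
    n = len(u)
    faces = [rusanov(u[max(i - 1, 0)], u[min(i, n - 1)]) for i in range(n + 1)]
    new_u = [u[i] - dt / dx * (faces[i + 1] - faces[i]) for i in range(n)]
    return new_u, faces[0], faces[n]


def max_balance_error(n_cells=100, n_steps=50):
    """Largest violation of the discrete version of (*) over the run."""
    dx = 1.0 / n_cells
    dt = 0.5 * dx  # CFL number 0.5, since |F'(u)| <= 1
    # Discontinuous initial data: light traffic behind, a jam ahead.
    u = [0.2 if (i + 0.5) * dx < 0.5 else 0.9 for i in range(n_cells)]
    worst = 0.0
    for _ in range(n_steps):
        mass_before = sum(u) * dx
        u, f_left, f_right = step(u, dx, dt)
        mass_after = sum(u) * dx
        # Discrete (*): change in mass = dt * (F(a) - F(b)).
        residual = (mass_after - mass_before) - dt * (f_left - f_right)
        worst = max(worst, abs(residual))
    return worst


if __name__ == "__main__":
    print(max_balance_error())
```

The interior interface fluxes cancel in pairs when the update is summed over all cells, so the total amount in the domain changes only through the two end fluxes, up to rounding error. No derivative of $u$ is ever taken.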
Most applications of PDEs, such as optimal control, differential games, fluid flow, etc., have a similar flavor. One writes down a function, like a value function in optimal control, and the function is in general just Lipschitz continuous. Then one wants to explore more properties of this function and finds that it satisfies a PDE (the Hamilton-Jacobi-Bellman equation), but since the function is not differentiable we look for a weak notion of solution (here, the viscosity solution) that makes our Lipschitz function the unique solution of the PDE. The point is that without a mind towards applications, one is shooting in the dark and will not find elegant answers to such questions.
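A standard one-dimensional illustration (chosen here as an example, not taken from the discussion above) is the eikonal equation

$$|u'(x)| = 1 \ \text{ for } x\in(-1,1), \qquad u(-1)=u(1)=0.$$

There is no classical solution, since a differentiable $u$ with $u(-1)=u(1)$ must have $u'=0$ somewhere. The distance to the boundary, $u(x) = 1-|x|$, is Lipschitz and satisfies the equation wherever it is differentiable, but so do $-u$ and infinitely many sawtooth functions. The viscosity notion singles out $u(x)=1-|x|$ as the unique solution, which is exactly the function an application (distance to the boundary, or minimal exit time at unit speed) would lead one to write down.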
Best Answer
It is well known (see Evans Ch.2) that solutions of parabolic equations are not unique without some growth conditions at $\infty$. For the linear heat equation, it is required that $|u(x,t)|\leq e^{A|x|^2}$ for some constant $A>0$, so polynomial and exponential growth are allowed. So it is not unusual to restrict to, say, solutions of polynomial growth, since within this class one has existence and uniqueness. Functions with polynomial growth constitute a very wide class of functions in the context of solving PDEs. Most of the time, one has to impose far stronger growth conditions, say boundedness or linear growth, to get uniqueness of viscosity solutions (in the nonlinear setting).
ADDITION: When the growth bounds are violated, solutions are not unique. For the heat equation, there are infinitely many solutions that violate the growth condition $|u(x,t)|\leq e^{A|x|^2}$. These solutions are so large as $x\to \infty$ that heat flows very quickly from $\infty$ in towards the origin and the solution exhibits finite-time blow-up. So the solutions are poorly behaved and do not represent the physical phenomenon being modeled (e.g., heat flow), and are normally discarded as being "non-physical".
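A classical illustration of this non-uniqueness is Tychonoff's example, sketched here for the heat equation normalized as $u_t = u_{xx}$ in one space dimension (the normalization and the particular choice of $g$ are assumptions made for the example). Take

$$u(x,t) = \sum_{k=0}^{\infty} \frac{g^{(k)}(t)}{(2k)!}\, x^{2k}, \qquad g(t) = \begin{cases} e^{-1/t^2}, & t>0, \\ 0, & t\leq 0. \end{cases}$$

Differentiating term by term (formally) gives

$$u_{xx} = \sum_{k=1}^{\infty} \frac{g^{(k)}(t)}{(2k-2)!}\, x^{2k-2} = \sum_{k=0}^{\infty} \frac{g^{(k+1)}(t)}{(2k)!}\, x^{2k} = u_t,$$

and one can show that the series converges well enough to justify this. Since every derivative of $g$ vanishes at $t=0$, we have $u(x,0)=0$, yet $u(0,t)=g(t)>0$ for $t>0$. So $u$ and the zero function are two different solutions with the same initial data, and $u$ necessarily violates the growth condition.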