Cool question!
Thanks to user lionelbrits for his answer that prompted me to pull out my mechanics books and check the definitions of "canonical transformation" given by different authors.
If you look in Goldstein's classical mechanics text, in the section on canonical transformations, you'll find that canonical transformations are essentially defined as follows (I paraphrase):
Goldstein Definition: A transformation $f:\mathcal P\to\mathcal P$ on phase space $\mathcal P$ is canonical provided there exists a phase space function $K$ such that if $(q(t), p(t))$ is a solution to Hamilton's equations generated by $H$, then $(Q(t), P(t)) = f(q(t), p(t))$ is a solution to Hamilton's equations generated by $K$.
This is essentially the definition given by lionelbrits in his answer.
On the other hand, if you look, for example, in Spivak's mechanics text, then you'll find the following definition:
Spivak's Definition: A transformation $f:\mathcal P \to \mathcal P$ on phase space is canonical provided it preserves the symplectic form.
In more concrete terms (namely in canonical coordinates), Spivak's definition can be stated as follows:
The transformation $f(q,p) = (f^q(q,p), f^p(q,p))$ is canonical if and only if its Jacobian (derivative) matrix preserves the symplectic matrix $J$, namely
\begin{align}
f'(q,p)\,J\,f'(q,p)^t = J
\end{align}
where
\begin{align}
J=\begin{pmatrix}
0 & I_n \\
-I_n & 0 \\
\end{pmatrix},\qquad
f' = \begin{pmatrix}
\frac{\partial f^q}{\partial q} & \frac{\partial f^q}{\partial p} \\
\frac{\partial f^p}{\partial q} & \frac{\partial f^p}{\partial p} \\
\end{pmatrix}
\end{align}
where $2n$ is the dimension of phase space and $I_n$ is the $n\times n$ identity matrix.
It also turns out that
If a transformation is canonical in the sense defined by Spivak, then it is canonical in the sense of Goldstein with $K = H\circ f^{-1}$
but the converse is not true. In fact, the example you bring up is a counterexample to the converse! What lionelbrits showed in his answer is that the example you have written is a canonical transformation in the sense of Goldstein, but, as you should try to convince yourself (I did), the function $K = H\circ f^{-1}$ that you wrote down by inverting the transformation and plugging back into $H$ leads to Hamilton's equations that are not satisfied by $(Q(t), P(t)) = f(q(t), p(t))$. You can show this directly by writing down the equations of motion. You can also show this by computing the Jacobian of the transformation and showing that it does not preserve the symplectic matrix. In fact, you should find that the Jacobian is given by
\begin{align}
f'(q,p)=\begin{pmatrix}
1 & 0 \\
-\frac{1}{2\sqrt{q}} & \frac{1}{2\sqrt{p}} \\
\end{pmatrix}
\end{align}
and that
\begin{align}
f'(q,p) J f'(q,p)^t = \frac{1}{2\sqrt{p}} J
\end{align}
In other words, the Jacobian of the transformation preserves the symplectic matrix up to a multiplicative factor.
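If you'd like to check this numerically, here is a minimal sketch. It assumes the transformation is $Q = q$, $P = \sqrt{p} - \sqrt{q}$, which is what the Jacobian above corresponds to (up to additive constants); the function names and the sample point are my own choices for illustration.

```python
import math

def f(q, p):
    # Assumed transformation, read off from the Jacobian quoted above:
    # Q = q, P = sqrt(p) - sqrt(q)
    return (q, math.sqrt(p) - math.sqrt(q))

def jacobian(func, q, p, h=1e-6):
    # Central finite differences; rows are (Q, P), columns are (q, p)
    dq = [(a - b) / (2 * h) for a, b in zip(func(q + h, p), func(q - h, p))]
    dp = [(a - b) / (2 * h) for a, b in zip(func(q, p + h), func(q, p - h))]
    return [[dq[0], dp[0]], [dq[1], dp[1]]]

# Symplectic matrix J for one degree of freedom
J = [[0.0, 1.0], [-1.0, 0.0]]

def symplectic_product(q, p):
    # Computes f' J f'^t at the point (q, p)
    M = jacobian(f, q, p)
    return [[sum(M[i][a] * J[a][b] * M[j][b]
                 for a in range(2) for b in range(2))
             for j in range(2)] for i in range(2)]

q0, p0 = 2.0, 3.0                    # arbitrary sample point with q, p > 0
S = symplectic_product(q0, p0)
factor = 1.0 / (2.0 * math.sqrt(p0))  # the claimed prefactor 1/(2 sqrt(p))
print(S, factor)
```

The off-diagonal entries of `S` should agree with `factor` and `-factor`, so the product is a multiple of $J$ that depends on $p$, not $J$ itself.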
Speculation. I'm going to go out on a limb and guess that your professor calls Goldstein's definition a "local canonical transformation" and Spivak's definition a "canonical transformation." If we adopt this terminology, then it's clear from our remarks that the $K$ he gives shows that your example is a local canonical transformation, but that the transformation is not canonical.
You should really think about the variables we use as being like coordinates on some manifold, the configuration space (roughly the same as the phase space, I won't be careful about the distinction). In this language, changing variables is equivalent to changing coordinates on this manifold. The action is some scalar function on this space, and we can take coordinate derivatives, $\frac{\delta S}{\delta q_a}$ with respect to whatever coordinates $q_a$ we are using on the space. As in multivariable calculus, we can form the directional derivatives $D_v S = v_a\frac{\delta S}{\delta q_a}$ for any vector with components $v_a$ we like. If you want something more formal and geometric, the directional derivative is a Lie derivative on the configuration manifold.
Now, remember that when we vary the action, we demand that the variation be stationary. That is, we demand that all directional derivatives vanish, meaning $D_v S=0$ for all vectors $v$. This statement about the vanishing of directional derivatives, you'll note, is entirely agnostic to the coordinates we use, but nonetheless implies that if we use coordinates $q_a$, then $\frac{\delta S}{\delta q_a}=0$ for each $a$. But any coordinate system would result in the same condition that all the coordinate derivatives vanish. This should also be in some sense familiar from multivariable calculus.
So this is casting the invariance of the Euler-Lagrange equations in a geometric language. Aside from being nice, this is also going to be the right language to understand what's going on in the Hamiltonian picture.
The phase space is normally coordinatized by pairs of coordinates $(p^a,q_a)$, but this is really not necessary. At the end of the day, the phase space is again a manifold and the $(p,q)$ are simply a special coordinate system on that manifold (Darboux's theorem implies that we can always, at least locally, find such a coordinate system). The thing that defines these special coordinates, really, is that the symplectic form takes a very special form.
In case you are less than familiar with symplectic forms, let me do the following to motivate the idea. Instead of using the coordinates $(p^a,q_a)$, use a collective coordinate $\xi^a = \langle p^a, q_a\rangle$, so all I've really done is put the $p$'s and $q$'s into one big vector. Just to be clear, if $q_a$ and $p^a$ were $n$-dimensional vectors, then $\xi^a$ is a $2n$-dimensional vector formed by concatenating the components (well, any way of putting the components together will do; this will just change the precise form of the $\Omega$ I introduce in a moment by permuting its rows and columns appropriately). In terms of this, Hamilton's equations may now be written
$$
\frac{d\xi^a}{d t}=\Omega^{ab}\frac{\partial H}{\partial \xi^b}
$$
where $\Omega$ is some matrix. Usually, this looks something like
$$
\Omega=\left(\begin{array}{cc}
0 & -1\\
1 & 0\end{array}\right).
$$
This $\Omega$ is typically known as the inverse of the symplectic form, though sometimes you'll just hear it called the symplectic form (an abuse of language, but not uncommon) or, more accurately, a Poisson bivector. These names are not so important to what I want to say, but I figure I may as well mention the correct terminology for anyone who wants to try searching online themselves.
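To see the index gymnastics in action, here is a small sketch for one degree of freedom. The harmonic-oscillator Hamiltonian $H = (p^2+q^2)/2$ and the function names are assumptions chosen for illustration; the ordering $\xi = (p, q)$ and the matrix $\Omega$ are the ones written above.

```python
# Omega^{ab} for the ordering xi = (p, q), as written above
OMEGA = [[0.0, -1.0], [1.0, 0.0]]

def grad_H(xi):
    # Illustrative Hamiltonian H = (p^2 + q^2) / 2, so dH/dxi = (p, q)
    p, q = xi
    return [p, q]

def xi_dot(xi):
    # d xi^a / dt = Omega^{ab} dH/d xi^b
    g = grad_H(xi)
    return [sum(OMEGA[a][b] * g[b] for b in range(2)) for a in range(2)]

print(xi_dot([0.5, 2.0]))  # [-2.0, 0.5], i.e. dp/dt = -q and dq/dt = p
```

With this $\Omega$, the first component reproduces $\dot p = -\partial H/\partial q$ and the second $\dot q = \partial H/\partial p$, which is just Hamilton's equations in the usual form.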
Now, the symplectic form does, in fact, transform under coordinate changes just like you might expect any tensorial object over a manifold to do. If we take on faith that the symplectic form should transform as a tensor under coordinate changes, then we already know how the right-hand side of the rewritten Hamilton's equation transforms if we were to transform to some other coordinate system. Let's investigate the left-hand side.
Suppose we perform some transformation $\xi^a=\xi^a(\zeta)$ to a new coordinate system $\zeta$. Then by chain rule,
$$
\frac{d \xi^a}{d t}=\frac{\partial \xi^a}{\partial \zeta ^\mu}\frac{d\zeta^\mu}{d t},
$$
so we see, perhaps unsurprisingly, that the time derivative also transforms as a tensor under a coordinate change (I used $\mu$ for the indices of the new coordinates just to keep things visually distinct).
So in the end, we find that Hamilton's equations transform as
$$
\frac{d \xi^a}{d t}=\Omega^{ab}\frac{\partial H}{\partial \xi^b}\implies \frac{\partial \xi^a}{\partial \zeta ^\mu}\frac{d\zeta^\mu}{d t} = \Omega^{ab}\frac{\partial \zeta^\nu}{\partial \xi^b}\frac{\partial H}{\partial \zeta^\nu},
$$
and if we move the Jacobian from the left-hand side to the right, we find
$$
\frac{d\zeta^\mu}{dt}=\left(\frac{\partial\zeta^\mu}{\partial \xi^a}\Omega^{ab}\frac{\partial\zeta^\nu}{\partial\xi^b}\right)\frac{\partial H}{\partial \zeta^\nu}
$$
and hence we see that if we define a new symplectic form $\Omega^\prime$ by
$$
\Omega^{\prime\, \mu\nu}=\frac{\partial\zeta^\mu}{\partial \xi^a}\Omega^{ab}\frac{\partial\zeta^\nu}{\partial\xi^b},
$$
(which is equivalent to my assertion that $\Omega$ should be tensorial under coordinate changes) Hamilton's equations are actually invariant, in the sense that we still have equations of the form
$$
\frac{d \zeta^\mu}{dt}=\Omega^{\prime\, \mu\nu}\frac{\partial H}{\partial\zeta^\nu}.
$$
The only difference is that the components of $\Omega$ and $\Omega^\prime$ might not be the same. But really this shouldn't be such a shock after all this setup since we wouldn't expect the components of a tensor to remain the same after a coordinate change.
Consider as an example the Minkowski metric. We know what this looks like in Cartesian coordinates. If we changed to polar coordinates, for example, of course the component entries in the metric change, but it's still, in a very real sense, the same metric. We just have a new representation thereof.
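As a phase-space counterpart to the metric example, here is a minimal worked case. It assumes one degree of freedom with $\xi=(p,q)$ and the $\Omega$ written above, together with a rescaling $\zeta=(\lambda p,\, q)$ for a constant $\lambda\neq 0$ that I picked purely for illustration. The Jacobian is $\partial\zeta/\partial\xi=\mathrm{diag}(\lambda,1)$, so
$$
\Omega^\prime=\left(\begin{array}{cc}
\lambda & 0\\
0 & 1\end{array}\right)\left(\begin{array}{cc}
0 & -1\\
1 & 0\end{array}\right)\left(\begin{array}{cc}
\lambda & 0\\
0 & 1\end{array}\right)=\left(\begin{array}{cc}
0 & -\lambda\\
\lambda & 0\end{array}\right)=\lambda\,\Omega,
$$
and Hamilton's equations in the new coordinates read
$$
\frac{d\zeta^1}{dt}=-\lambda\frac{\partial H}{\partial \zeta^2},\qquad \frac{d\zeta^2}{dt}=\lambda\frac{\partial H}{\partial \zeta^1}.
$$
The dynamics are the same, but the components of $\Omega^\prime$ agree with those of $\Omega$ only when $\lambda=1$.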
So where do canonical transformations fit into all this? They are simply the very special coordinate transformations which actually leave the components of the symplectic form invariant. Formally, these are coordinate transformations generated by vector fields over phase space along which the Lie derivative of the symplectic form vanishes. This is very similar in many respects to a vector field being a Killing vector of some metric.
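A quick numerical sketch of the distinction, for one degree of freedom with $\xi=(p,q)$. The two coordinate changes below are assumptions chosen for illustration: polar-type coordinates $(r,\varphi)$ with $p=r\cos\varphi$, $q=r\sin\varphi$, and action-angle-type coordinates $(I,\varphi)$ with $I=r^2/2$. The Jacobian is taken by finite differences, so agreement is only up to numerical error.

```python
import math

# Omega^{ab} for the ordering xi = (p, q), as in the text
OMEGA = [[0.0, -1.0], [1.0, 0.0]]

def polar(p, q):
    # Illustrative coordinates (r, phi); expected to be non-canonical
    return (math.hypot(p, q), math.atan2(q, p))

def action_angle(p, q):
    # Illustrative coordinates (I, phi) with I = r^2 / 2; expected to be canonical
    return (0.5 * (p * p + q * q), math.atan2(q, p))

def jacobian(change, p, q, h=1e-6):
    # d zeta / d xi by central differences; rows zeta^mu, columns (p, q)
    dp = [(a - b) / (2 * h) for a, b in zip(change(p + h, q), change(p - h, q))]
    dq = [(a - b) / (2 * h) for a, b in zip(change(p, q + h), change(p, q - h))]
    return [[dp[0], dq[0]], [dp[1], dq[1]]]

def transformed_omega(change, p, q):
    # Omega'^{mu nu} = (d zeta^mu / d xi^a) Omega^{ab} (d zeta^nu / d xi^b)
    M = jacobian(change, p, q)
    return [[sum(M[mu][a] * OMEGA[a][b] * M[nu][b]
                 for a in range(2) for b in range(2))
             for nu in range(2)] for mu in range(2)]

p0, q0 = 3.0, 4.0  # sample point with r = 5
omega_polar = transformed_omega(polar, p0, q0)
omega_action = transformed_omega(action_angle, p0, q0)
print(omega_polar)   # approximately (1/r) * OMEGA
print(omega_action)  # approximately OMEGA itself
```

In the polar-type coordinates the components pick up a factor $1/r$, while in the action-angle-type coordinates they come back unchanged, which is exactly the property that singles out a canonical transformation.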
Finally, I should point out that the way I have framed the entire discussion above, it may seem strange why we should consider canonical transformations at all. After all, we can use any transformation at the cost of a nice form for the symplectic form. Perhaps non-canonical transforms can put the equations in a nice form.
In principle this is of course true. However canonical transformations play a very essential role which is intimately tied to Noether's theorem and symmetry. Essentially one can guarantee that every symmetry of the action corresponds to precisely a canonical transformation. Furthermore, only vector fields which correspond to canonical transformations are guaranteed to have a charge associated to them (like the Hamiltonian is the charge associated to time evolution (which is itself a canonical transformation)).
Best Answer
I think your solution is basically correct.
Part (a)
To find the missing transformations of the momenta, we first try to find a generating function ${\cal F}_2(q, P)$ that generates the known transformations of the coordinates. Then, we use this generating function ${\cal F}_2(q, P)$ to compute the relations regarding the momenta.
The transformation of coordinates $Q^i = Q^i(q)$ can be conveniently generated by the generating function of type 2 as \begin{align} {\cal F}_2(q, P) &=\sum_i P_i \, Q^i(q) - F(q), \end{align} where $F(q)$ is an arbitrary function of $q$.
In this way, the requirement \begin{align} \frac{ \partial {\cal F}_2(q, P) }{ \partial P_i} = Q^i(q) \end{align} is automatically satisfied.
In our case \begin{align} {\cal F}_2(q, P) &= P_1 \, Q^1(q^1, q^2) + P_2 \, Q^2(q^1, q^2) - F \\ &= P_1 \, (q^1)^2 + P_2 \, (q^1 + q^2) - F, \end{align} where $F \equiv F(q^1, q^2)$ is an arbitrary function of $q^1$ and $q^2$.
So \begin{align} p_1 &= \frac{ \partial {\cal F}_2(q, P) }{ \partial q^1 } = 2 P_1 \, q^1 + P_2 - \frac{\partial F }{\partial q^1}, \\ p_2 &= \frac{ \partial {\cal F}_2(q, P) }{ \partial q^2 } = P_2 - \frac{\partial F }{\partial q^2}. \end{align}
Or \begin{align} P_1 &= \frac{1}{2q^1} \left( p_1 + \frac{ \partial F } { \partial q^1 } -p_2 - \frac{ \partial F } { \partial q^2 } \right) \tag{1} \\ P_2 &= p_2 + \frac{ \partial F } { \partial q^2 }. \tag{2} \end{align}
Part (b)
Basically we need to find an $F$ such that $K$ matches $H$, because $$ d{\cal F}_2 = p \, dq + Q \, dP + (K - H) \, dt, $$ and our ${\cal F}_2$ does not depend on time explicitly (so $K-H$ must vanish).
Now by the solution of part (a), we have
\begin{align} H &= \left( \frac{p_1 - p_2}{2q^1} \right)^2 + p_2 + (q^1 + q^2)^2, \\ K &= P_1^2 + P_2 \\ &= \left( \frac{p_1 - p_2 + \partial F/\partial q^1 - \partial F/\partial q^2}{2q^1} \right)^2 + p_2 + \partial F / \partial q^2. \end{align}
It would be nice if \begin{align} \partial F/\partial q^1 &= \partial F/\partial q^2, \\ \partial F/\partial q^2 &= (q^1 + q^2)^2. \end{align}
A simple solution would be \begin{align} F = \frac{1}{3} (q^1 + q^2)^3. \end{align}
Then Eqs. (1) and (2) become \begin{align} P_1 &= \frac{1}{2q^1} \left( p_1 - p_2 \right) \tag{1'} \\ P_2 &= p_2 + (q^1 + q^2)^2. \tag{2'} \end{align}
The result $P_1 = (p_1 + p_2)/(2q^1)$ doesn't make sense, because it implies $\dot P_1 \ne 0 = -\partial K/\partial Q^1$. So the plus sign might be a typo.
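Both claims can be checked numerically. The sketch below (function names, step size and sample point are my own choices) builds the $4\times 4$ Jacobian of $(q^1,q^2,p_1,p_2)\mapsto(Q^1,Q^2,P_1,P_2)$ by finite differences, tests whether it preserves the symplectic matrix, and compares $H$ with $K=P_1^2+P_2$. The `sign` argument switches between the minus sign derived here and the plus sign that looks like a typo.

```python
def transform(x, sign=-1.0):
    # (q1, q2, p1, p2) -> (Q1, Q2, P1, P2); sign=-1 is the result derived above
    q1, q2, p1, p2 = x
    return [q1 ** 2,
            q1 + q2,
            (p1 + sign * p2) / (2 * q1),
            p2 + (q1 + q2) ** 2]

def jacobian(func, x, h=1e-5):
    # Central differences; rows are new variables, columns are old variables
    cols = []
    for j in range(len(x)):
        up = list(x); up[j] += h
        dn = list(x); dn[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(func(up), func(dn))])
    return [list(row) for row in zip(*cols)]

# Symplectic matrix for the ordering (q1, q2, p1, p2)
J = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]

def symplectic_defect(sign, x):
    # Largest entry of |M J M^t - J|; close to zero for a canonical transformation
    M = jacobian(lambda y: transform(y, sign), x)
    worst = 0.0
    for i in range(4):
        for j in range(4):
            entry = sum(M[i][a] * J[a][b] * M[j][b]
                        for a in range(4) for b in range(4))
            worst = max(worst, abs(entry - J[i][j]))
    return worst

def H(x):
    q1, q2, p1, p2 = x
    return ((p1 - p2) / (2 * q1)) ** 2 + p2 + (q1 + q2) ** 2

def K(x, sign=-1.0):
    _, _, P1, P2 = transform(x, sign)
    return P1 ** 2 + P2

x0 = [1.5, 0.7, 0.4, -1.1]  # arbitrary sample point with q1 != 0
print(symplectic_defect(-1.0, x0))  # tiny: minus sign is canonical
print(symplectic_defect(+1.0, x0))  # order one: plus sign is not
print(H(x0) - K(x0))                # essentially zero
```

With the minus sign the defect is at the level of finite-difference noise and $K$ reproduces $H$; with the plus sign the brackets $\{Q^2,P_1\}$ and $\{P_1,P_2\}$ no longer vanish, consistent with the plus sign being a typo.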