First, I believe you mean a European option with payoff $\max(S^2_T- K, 0)$, not $\max(S^2_T, K)$. As to why the solution seems so specific and not very general: that book is a collection of job interview questions, so it assumes the reader is already very comfortable with the material and wants solutions that are fast, easy to remember, and free of long calculations.
For the first question, $S_t = S_0 \cdot \exp((r - \frac{\sigma^2}{2})t + \sigma W_t)$ is the solution of the geometric Brownian motion SDE for any time $t$. Nothing stops us from squaring both sides to get $S_t^2$, giving $S_t^2 = S_0^2 \exp((2r -\sigma^2)t + 2\sigma W_t)$. We could also find this using Ito's lemma with $f(t, x) = x^2$. To see this, let $S_t$ follow
$$dS_t = rS_tdt + \sigma S_tdW_t.$$
Using $f(t, x) = x^2$ by Ito's lemma we have
$$df = (2rf + \sigma^2 f)dt + 2 \sigma fdW_t.$$
Then if we let $x = S_t$ we have
$$dS_t^2 = (2rS_t^2 + \sigma^2 S_t^2)dt + 2 \sigma S_t^2dW_t.$$
This then has the solution for $S_t^2$ given above.
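As a quick consistency check (a sketch that only uses the standard solution formula for geometric Brownian motion; the symbols $Y_t$, $\mu$ and $\nu$ are notation introduced here for illustration), recall that $dY_t = \mu Y_t dt + \nu Y_t dW_t$ has the solution $Y_t = Y_0 \exp((\mu - \frac{\nu^2}{2})t + \nu W_t)$. Taking $Y_t = S_t^2$, $\mu = 2r + \sigma^2$ and $\nu = 2\sigma$ gives
$$\mu - \frac{\nu^2}{2} = 2r + \sigma^2 - \frac{(2\sigma)^2}{2} = 2r - \sigma^2, \qquad S_t^2 = S_0^2 \exp((2r - \sigma^2)t + 2\sigma W_t),$$
which agrees with the expression obtained by squaring $S_t$ directly.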
For the second question, $F_T(0)$ is the forward price of $S_t$ at time $t=0$ and $\nu$ would be equal to $2\sigma$. The use of the Black model is not very intuitive, and most derivations do not use that approach. I think a more intuitive approach is to note that $S_t^2$ is a geometric Brownian motion with drift $2r + \sigma^2$ and volatility $2\sigma$, and to use the Black-Scholes formula with the proper adjustments: take the Black-Scholes formula with dividends, use $S_0^2$ as the spot price, and let $\hat{\sigma} = 2\sigma$, $\hat{r} = r$, and $\hat{q} = -(\sigma^2 + r)$, so that the risk-neutral drift $\hat{r} - \hat{q}$ equals $2r + \sigma^2$.
As far as a more general framework for option pricing goes, your best bet is to consult a book on it. Two that come to mind are Martingale Methods in Financial Modelling by Musiela and Rutkowski and Arbitrage Theory in Continuous Time by Björk, but almost any will suffice; they mainly differ in how deep they go mathematically.
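The substitution described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `bs_call_with_dividends` and `squared_call` and the sample parameter values are my own choices, and only the standard library is used.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the left tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bs_call_with_dividends(spot, strike, r, q, sigma, T):
    """Black-Scholes call price with a continuous dividend yield q."""
    vol_sqrt_t = sigma * math.sqrt(T)
    d1 = (math.log(spot / strike) + (r - q + 0.5 * sigma**2) * T) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    return (spot * math.exp(-q * T) * norm_cdf(d1)
            - strike * math.exp(-r * T) * norm_cdf(d2))


def squared_call(s0, strike, r, sigma, T):
    """Call on S_T^2: spot S_0^2, volatility 2*sigma, yield q = -(sigma^2 + r)."""
    return bs_call_with_dividends(
        spot=s0**2,
        strike=strike,
        r=r,
        q=-(sigma**2 + r),
        sigma=2.0 * sigma,
        T=T,
    )


# Hypothetical sample inputs, chosen only for illustration.
price = squared_call(s0=10.0, strike=100.0, r=0.05, sigma=0.2, T=1.0)
print(price)
```

The same number should come out of the Black formula with forward $S_0^2 e^{(2r+\sigma^2)T}$ and volatility $\nu = 2\sigma$, since the two parametrisations describe the same lognormal distribution for $S_T^2$.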
The Black-Scholes call option price is the discounted expected value of the payoff $\max(S_T-X,0)$ under the risk-neutral probability measure. The existence of such a measure is guaranteed with a market model (such as Black-Scholes) that is arbitrage-free, and a negative option price represents an (impossible) arbitrage opportunity.
However, even if you knew nothing about arbitrage pricing theory you could prove that $C(t,S_t) > 0$ for all $S_t >0$ and for all $0 \leqslant t < T$ directly from the formula. The inequality
$$S_t < Xe^{-r(T-t)}\frac{N(d_1)}{N(d_2)},$$
which would ostensibly yield a negative option price, can never hold. Note that the ratio $N(d_1)/N(d_2)$ is not a constant but rather depends nonlinearly on the ratio $S_t/Xe^{-r(T-t)}$.
Consider the price $C(t,S) = SN(d_1) - Xe^{-r(T-t)}N(d_2)$ treating $S_t = S$ as a fixed parameter. We have
$$\lim_{t \to T}d_1 = \lim_{t \to T}d_2 = \begin{cases}+\infty, & S> X\\0,&S = X\\-\infty, &S < X\end{cases}, \quad\lim_{t \to T}N(d_1) = \lim_{t \to T}N(d_2) = \begin{cases}1, & S> X\\\frac{1}{2},&S = X\\0, &S < X\end{cases}$$
and, thus, $\lim_{t\to T}C(t,S) =\max(S-X,0) \geqslant 0$.
Taking the partial derivative of the option price with respect to $t$, we find
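$$\begin{aligned}\frac{\partial C}{\partial t}(t,S) &= S N'(d_1) \frac{\partial d_1}{\partial t} - Xe^{-r(T-t)}N'(d_2)\frac{\partial d_2}{\partial t} - rXe^{-r(T-t)}N(d_2)\\&= -\frac{S\sigma N'(d_1)}{2\sqrt{T-t}}-rXe^{-r(T-t)}N(d_2)<0.\end{aligned}$$
The second line uses the identity $S N'(d_1) = Xe^{-r(T-t)}N'(d_2)$ together with $d_1 - d_2 = \sigma\sqrt{T-t}$, so that $\frac{\partial d_1}{\partial t} - \frac{\partial d_2}{\partial t} = -\frac{\sigma}{2\sqrt{T-t}}$.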
Since the partial derivative is strictly less than $0$, the option price is a decreasing function of time $t$ for each fixed value of $S$.
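The closed-form derivative can be sanity-checked against a central finite difference in $t$. The snippet below is only a sketch under assumed sample parameters; the helper names (`call_price`, `dC_dt`) and the step size `h` are my own choices.

```python
import math


def norm_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x: float) -> float:
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def d1_d2(S, X, r, sigma, tau):
    """d_1 and d_2 for time to expiry tau = T - t > 0."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return d1, d1 - sigma * math.sqrt(tau)


def call_price(S, X, r, sigma, tau):
    d1, d2 = d1_d2(S, X, r, sigma, tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)


def dC_dt(S, X, r, sigma, tau):
    """Analytic partial derivative of the call price with respect to t."""
    d1, d2 = d1_d2(S, X, r, sigma, tau)
    return (-S * sigma * norm_pdf(d1) / (2.0 * math.sqrt(tau))
            - r * X * math.exp(-r * tau) * norm_cdf(d2))


# Hypothetical sample inputs, chosen only for illustration.
S, X, r, sigma, T, t = 100.0, 100.0, 0.05, 0.2, 1.0, 0.25
h = 1e-5  # finite-difference step in t

analytic = dC_dt(S, X, r, sigma, T - t)
numeric = (call_price(S, X, r, sigma, T - (t + h))
           - call_price(S, X, r, sigma, T - (t - h))) / (2.0 * h)
print(analytic, numeric)
```

Both numbers should be negative and agree to several decimal places, which is what the sign argument in the text relies on.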
Hence, the option price decreases to the limit as $t$ increases to $T$, that is
$$C(t,S) \downarrow \max(S-X,0) \geqslant 0 \, \text{as} \,\, t\to T,$$
and it follows that $C(t,S) > 0$ strictly for all $t <T$.
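A small numerical illustration of this conclusion, assuming arbitrary sample parameters (the grid of spot prices and times below is my own choice, and floating-point checks like this illustrate the result rather than prove it):

```python
import math


def norm_cdf(x: float) -> float:
    # erfc keeps precision when x is very negative (out-of-the-money case).
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def call_price(S, X, r, sigma, tau):
    """Black-Scholes call price with time to expiry tau = T - t > 0."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)


# Hypothetical sample inputs, chosen only for illustration.
X, r, sigma, T = 100.0, 0.05, 0.2, 2.0
times = [0.0, 0.5, 1.0, 1.5, 1.9, 1.95]  # increasing t, all below T

results = {}
for S in (80.0, 100.0, 120.0):
    results[S] = [call_price(S, X, r, sigma, T - t) for t in times]
    print(S, max(S - X, 0.0), results[S])
```

For each fixed $S$ the printed prices should shrink as $t$ grows while staying strictly above $\max(S-X,0)$.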
It comes from simply rewriting the expression $S_T > K$. The following inequalities are all equivalent:
\begin{align*}
S_T &> K \\
S_0 \exp\left( \left(r-\frac{\sigma^2}{2}\right)T + \sigma \sqrt{T}Z\right) &> K \\
\left(r-\frac{\sigma^2}{2}\right)T + \sigma \sqrt{T}Z &> \ln\left(\frac{K}{S_0}\right) \\
Z &> \frac{\ln (K/S_0) - \left(r-\frac{\sigma^2}{2}\right)T}{\sigma \sqrt T} \\
Z &> -\left( \frac{\ln (S_0/K) + \left(r+\frac{\sigma^2}{2}\right)T}{\sigma \sqrt T}\right) + \sigma \sqrt T.
\end{align*}
Since
$$d_+ = \frac{\ln (S_0/K) + \left(r+\frac{\sigma^2}{2}\right)T}{\sigma \sqrt T},$$
we have that $1_{S_T > K} = 1_{Z > -d_+ + \sigma \sqrt T}$.
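The equivalence of the two indicator events can be checked numerically on a grid of values for $Z$. This is a minimal sketch; the parameter values and the helper name `terminal_price` are assumptions made for illustration.

```python
import math

# Hypothetical sample inputs, chosen only for illustration.
S0, K, r, sigma, T = 100.0, 110.0, 0.05, 0.2, 1.0

d_plus = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
threshold = -d_plus + sigma * math.sqrt(T)


def terminal_price(z: float) -> float:
    """S_T written as a function of the standard normal draw Z = z."""
    return S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)


# Grid from -3 to 3 in steps of 0.25; no grid point sits on the threshold.
z_grid = [-3.0 + 0.25 * i for i in range(25)]
mismatches = [z for z in z_grid if (terminal_price(z) > K) != (z > threshold)]

print(threshold, mismatches)
```

An empty `mismatches` list means $S_T > K$ and $Z > -d_+ + \sigma\sqrt{T}$ agreed at every grid point, and evaluating `terminal_price(threshold)` should return the strike up to rounding.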