The key point is to write $B_T^2$ in a clever way:
$$\begin{align*} B_T^2 &= \sum_{j=1}^n (B_{t_j}^2-B_{t_{j-1}}^2) \\ &= \sum_{j=1}^n \big( (B_{t_j}-B_{t_j^{\ast}})+B_{t_j^{\ast}} \big)^2 - \big( (B_{t_{j-1}}-B_{t_j^{\ast}})+B_{t_j^{\ast}} \big)^2 \\ &= 2 \sum_{j=1}^n B_{t_j^{\ast}} (B_{t_j}-B_{t_{j-1}}) + \sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2- \sum_{j=1}^n (B_{t_{j-1}}-B_{t_j^{\ast}})^2 \tag{1} \end{align*}$$
where $t_j^{\ast}:= \frac{1}{2}(t_j+t_{j-1})$ is the midpoint of $[t_{j-1},t_j]$. Now, by the (midpoint) definition of the Stratonovich integral, the sum in the first term on the right-hand side converges to
$$\int_0^T B_t \circ dB_t.$$
Consequently, it remains to show that the remaining terms converge to $0$ as the mesh size $|\Pi|$ converges to $0$. To this end, we set
$$S_n := \sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2$$
and note that, since $\mathbb{E}[(B_t-B_s)^2]=t-s$ by the stationarity of the increments and $t_j-t_j^{\ast} = \frac{1}{2}(t_j-t_{j-1})$,
$$\mathbb{E}(S_n) = \mathbb{E} \left( \sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2 \right) = \sum_{j=1}^n (t_j-t_j^{\ast}) = \frac{1}{2}\sum_{j=1}^n (t_j-t_{j-1}) = \frac{T}{2}.$$
Consequently, by the independence of the increments,
$$\begin{align*} \mathbb{E} \left[ \left( \sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2 - \frac{T}{2} \right)^2 \right] &= \text{Var} \, (S_n) \\ &= \sum_{j=1}^n \text{Var} \, \left[ (B_{t_j}-B_{t_j^{\ast}})^2 - (t_j-t_j^{\ast}) \right]. \end{align*}$$
Using the scaling property, i.e. $B_t \sim \sqrt{t} B_1$ for any $t \geq 0$, each summand satisfies
$$\text{Var} \, \left[ (B_{t_j}-B_{t_j^{\ast}})^2 \right] = (t_j-t_j^{\ast})^2 \, \mathbb{E}\big((B_1^2-1)^2\big) = C (t_j-t_j^{\ast})^2$$
for $C:= \mathbb{E}((B_1^2-1)^2)<\infty$. Since $t_j-t_j^{\ast} \leq |\Pi|$ and $\sum_{j=1}^n (t_j-t_j^{\ast}) = \frac{T}{2}$, summing over $j$ gives
$$\mathbb{E} \left[ \left( \sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2 - \frac{T}{2} \right)^2 \right] \leq C |\Pi| \sum_{j=1}^n (t_j-t_j^{\ast}) = \frac{C|\Pi| T}{2} \leq C|\Pi| T,$$
which converges to $0$ as $|\Pi| \to 0$. Hence,
$$\sum_{j=1}^n (B_{t_j}-B_{t_j^{\ast}})^2 \stackrel{L^2}{\to} \frac{T}{2}.$$
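This $L^2$-convergence is easy to check numerically. The following is a minimal Monte Carlo sketch (the uniform partition, the values of $T$, $n$, and the number of paths are my own choices, not from the argument above): it samples Brownian paths on a fine grid containing both the $t_j$ and the midpoints $t_j^{\ast}$, and estimates $\mathbb{E}[(S_n - T/2)^2]$.

```python
# Monte Carlo check that S_n = sum_j (B_{t_j} - B_{t_j*})^2 is close to T/2 in L^2.
# Hypothetical setup: uniform partition t_j = jT/n, midpoints at the odd fine-grid points.
import numpy as np

rng = np.random.default_rng(0)
T, n, n_paths = 1.0, 200, 20000
dt = T / (2 * n)  # fine grid step; the fine grid contains all t_j and t_j*

# Brownian paths on the fine grid: shape (n_paths, 2n + 1), starting at 0
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, 2 * n))
B = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(increments, axis=1)], axis=1)

B_tj = B[:, 2::2]   # B at t_1, ..., t_n   (even fine-grid indices)
B_mid = B[:, 1::2]  # B at t_1*, ..., t_n* (odd fine-grid indices)
S_n = np.sum((B_tj - B_mid) ** 2, axis=1)

mse = np.mean((S_n - T / 2) ** 2)  # estimates E[(S_n - T/2)^2]
print(np.mean(S_n), mse)           # mean near T/2 = 0.5, mse well below C*|Pi|*T
```

With $|\Pi| = T/n$, the estimated mean-square error sits comfortably below the bound $C|\Pi|T$ from the display above.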
Exactly the same calculation goes through for the third addend in $(1)$. Finally, we conclude
$$B_T^2 = 2\int_0^T B_t \circ dB_t + \frac{T}{2} - \frac{T}{2} = 2\int_0^T B_t \circ dB_t.$$
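The identity can also be seen pathwise in simulation. This sketch (partition size and seed are my own choices) compares $B_T^2$ with twice the midpoint Riemann sum $\sum_j B_{t_j^{\ast}}(B_{t_j}-B_{t_{j-1}})$ on a single simulated path; by the argument above the two agree up to an error that vanishes with the mesh.

```python
# Pathwise sanity check: 2 * sum_j B_{t_j*}(B_{t_j} - B_{t_{j-1}}) is close to B_T^2.
# Hypothetical simulation on a uniform partition; midpoints sit on the odd grid points.
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 4000
dt = T / (2 * n)

increments = rng.normal(0.0, np.sqrt(dt), size=2 * n)
B = np.concatenate([[0.0], np.cumsum(increments)])

B_left = B[0:-1:2]   # B at t_0, ..., t_{n-1}
B_mid = B[1::2]      # B at t_1*, ..., t_n*
B_right = B[2::2]    # B at t_1, ..., t_n

stratonovich_sum = np.sum(B_mid * (B_right - B_left))
print(2 * stratonovich_sum, B[-1] ** 2)  # the two numbers nearly agree
```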
It's enough that $f$ be Lebesgue measurable and $\int_0^t [f(s)]^2\,ds<\infty$ for each $t$. Indeed, in this case your stochastic integral $Y_t:=\int_0^t f(s)\,dB_s$ is well defined. Moreover, if we set $A_t:=\int_0^t[f(s)]^2\,ds$, then $Y$ is a (mean-zero) continuous martingale with quadratic variation $\langle Y\rangle_t=A_t$.
Let $\tau(t):=\inf\{s:A_s>t\}$. Then $Z_t:=Y_{\tau(t)}$ is a continuous martingale with quadratic variation $t$; by Lévy's characterization, $Z$ is a Brownian motion. Because $A$ is deterministic, $Y_t=Z_{A_t}$, $t\ge 0$, is a deterministic re-parametrization of a Gaussian process, and is therefore also a Gaussian process.
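A quick numerical illustration of this, for a concrete (hypothetical) choice $f(s)=s$: the Itô sums approximating $Y_t$ should then be centered, approximately Gaussian, with variance $A_t=\int_0^t s^2\,ds = t^3/3$.

```python
# Monte Carlo illustration that Y_t = int_0^t f(s) dB_s is centered Gaussian with
# variance A_t = int_0^t f(s)^2 ds. Hypothetical choice f(s) = s, so A_1 = 1/3.
import numpy as np

rng = np.random.default_rng(2)
t_end, n, n_paths = 1.0, 1000, 50000
dt = t_end / n
s = np.arange(n) * dt  # left endpoints s_0, ..., s_{n-1}

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
Y = (s * dB).sum(axis=1)  # Ito sums approximating Y_{t_end}

print(np.var(Y), t_end ** 3 / 3)  # both close to 1/3
# crude Gaussianity check: the excess kurtosis of a normal distribution is 0
kurt = np.mean((Y - Y.mean()) ** 4) / np.var(Y) ** 2 - 3.0
print(kurt)                        # close to 0
```

Each $Y$ here is a finite linear combination of independent Gaussians, so it is exactly Gaussian; the sample variance matches $t^3/3$ up to discretization and Monte Carlo error.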
Note that in the Stratonovich convention the integrand is evaluated at the midpoint, $G_{\frac{t_{i+1}+t_i}{2}} \approx \frac{G_{t_{i+1}}+G_{t_i}}{2}$, so that $$ \mathbb{E}\left[\int_{t_0}^t G_s\circ\mathrm{d}W_s\right] = \lim_{h\rightarrow0} \frac{1}{2} \sum_i \mathbb{E}[(G_{t_{i+1}}+G_{t_i})(W_{t_{i+1}}-W_{t_i})], $$ where $h$ denotes the mesh of the partition. Now, let's recall that the increments of a Wiener process are independent of past values, hence $\mathbb{E}[G_{t_i}(W_{t_{i+1}}-W_{t_i})] = 0$, but in general $\mathbb{E}[G_{t_{i+1}}(W_{t_{i+1}}-W_{t_i})] \neq 0$, because $W_{t_{i+1}}-W_{t_i}$ lies in the future relative to $G_{t_i}$ but in the past relative to $G_{t_{i+1}}$.
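For a concrete (hypothetical) choice $G = W$, both expectations can be checked numerically: by independence $\mathbb{E}[W_{t_i}(W_{t_{i+1}}-W_{t_i})]=0$, while $\mathbb{E}[W_{t_{i+1}}(W_{t_{i+1}}-W_{t_i})] = \mathbb{E}[(W_{t_{i+1}}-W_{t_i})^2] = t_{i+1}-t_i$.

```python
# Numerical check of the two expectations, taking G = W (a hypothetical choice):
# E[W_{t_i} (W_{t_{i+1}} - W_{t_i})]     = 0              (independent increments)
# E[W_{t_{i+1}} (W_{t_{i+1}} - W_{t_i})] = t_{i+1} - t_i  (here 0.2)
import numpy as np

rng = np.random.default_rng(3)
t_i, t_ip1, n_paths = 0.5, 0.7, 200000

W_ti = rng.normal(0.0, np.sqrt(t_i), size=n_paths)
dW = rng.normal(0.0, np.sqrt(t_ip1 - t_i), size=n_paths)  # increment, independent of W_ti
W_tip1 = W_ti + dW

print(np.mean(W_ti * dW))    # near 0
print(np.mean(W_tip1 * dW))  # near t_{i+1} - t_i = 0.2
```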