This is a bit late, but I see that the main points in this question have not been completely addressed. I'll set
\begin{equation}
\sigma = 1
\end{equation}
for this answer.
The definition of white noise may be context-dependent: how you define it depends on what you want to do with it. There's nothing inherently wrong with saying that white noise (indexed by a set $T$) is just a process of iid standard normal random variables indexed by $T$, i.e., $E[X(t)X(s)] = \begin{cases} 1 & t = s \\ 0 & t \neq s. \end{cases}$ However, as cardinal noted here, Example 1.2.5 of Kallianpur's text shows that this process is not measurable (as a function of $(t, \omega)$). This is why, as Did commented above, $Y$ is undefined with this definition of $X$. Thus, this definition of white noise is not appropriate for defining objects like $Y$.
Rather, you want $X$ to have covariance given by the Dirac delta. But $\delta$ is not a function but rather a measure, and the best context for understanding it is the theory of distributions (also called generalized functions; these are not to be confused with "probability distributions"). Likewise, the appropriate context for white noise is the theory of random distributions.
Let's warm up with a heuristic explanation: We'll think of white noise as the "derivative" of Brownian motion: "$dB_t/dt = X_t$". So ignoring rigor for a moment, we could write
\begin{equation}
\int_0^T h(t) X(t) dt = \int_0^T h(t) \frac{dB_t}{dt} dt = \int_0^T h(t) dB_t.
\end{equation}
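Before making this rigorous, the heuristic can be probed numerically: the scaled increments $\Delta B / \Delta t$ of a simulated Brownian path have variance $1/\Delta t$, which blows up as the grid is refined — the numerical signature of a delta-function covariance. A minimal sketch in Python (the step size and sample count are arbitrary choices, not from the question):

```python
import math
import random

random.seed(1)

dt = 0.01    # grid spacing (arbitrary choice)
n = 20000    # number of increments to sample

# Scaled Brownian increments dB/dt play the role of "white noise" X(t):
# each dB is N(0, dt), so dB/dt is N(0, 1/dt).
xs = [random.gauss(0.0, math.sqrt(dt)) / dt for _ in range(n)]

mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n

# The sample variance should be close to 1/dt = 100; it diverges as
# dt -> 0, reflecting the delta-function covariance of white noise.
print(mean, var)
```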
The reason this isn't rigorous is that Brownian motion is nowhere differentiable. However, the theory of distributions allows us to "differentiate" non-differentiable functions. First of all, a distribution is a continuous linear functional (a linear map taking values in the real numbers) on a space of "test functions" (usually smooth functions with compact support). A continuous function $F$ can be viewed as a distribution via the pairing
\begin{equation}
(F, f) = \int_0^\infty F(t) f(t) dt.
\end{equation}
The distributional derivative of $F$ is the distribution $F'$ whose pairing with a test function $f$ is defined by
\begin{equation}
(F', f) = -(F, f').
\end{equation}
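As a standard example (not from the original question), this definition recovers the Dirac delta from a step function: for fixed $a > 0$, the distributional derivative of $t \mapsto \mathbf{1}_{t > a}$ is the Dirac mass at $a$, since for any test function $f$,
\begin{equation}
\left(\tfrac{d}{dt}\mathbf{1}_{t>a},\, f\right) = -\left(\mathbf{1}_{t>a},\, f'\right) = -\int_a^\infty f'(t)\, dt = f(a) = (\delta_a, f),
\end{equation}
using that $f$ has compact support.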
Thinking of Brownian motion as a random function, we can define white noise $X$ as its distributional derivative. Thus, $X$ is a random distribution whose pairing with a test function $f$ is the random variable
\begin{equation}
(X, f) = -(B, f') = -\int_0^\infty B(t) f'(t) dt.
\end{equation}
By stochastic integration by parts,
\begin{equation}
(X, f) = \int_0^\infty f(t) dB_t;
\end{equation}
this is the Itô integral of $f$ with respect to $B$.
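This integration-by-parts identity can be checked numerically on a single simulated path: discretize $-\int_0^T B(t) f'(t)\,dt$ and $\int_0^T f(t)\,dB_t$ on the same Brownian path and compare. A sketch in Python, with the arbitrary choice $f(t) = t(T - t)$, which vanishes at both endpoints as a test function should:

```python
import math
import random

random.seed(2)

T = 1.0
n = 100_000
dt = T / n

def f(t):
    return t * (T - t)       # vanishes at t = 0 and t = T

def fp(t):
    return T - 2.0 * t       # derivative of f

# Simulate one Brownian path on the grid.
B = [0.0]
for _ in range(n):
    B.append(B[-1] + random.gauss(0.0, math.sqrt(dt)))

# Left-hand side: -(B, f') = -∫ B(t) f'(t) dt  (ordinary Riemann sum).
lhs = -sum(B[i] * fp(i * dt) * dt for i in range(n))

# Right-hand side: ∫ f(t) dB_t  (Ito sum, left endpoints).
rhs = sum(f(i * dt) * (B[i + 1] - B[i]) for i in range(n))

# The two sums agree up to discretization error on the same path.
print(lhs, rhs)
```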
Now a well-known fact in stochastic calculus is that $M_T = \int_0^T f(t) dB_t$ (for deterministic, square-integrable $f$) is a martingale starting at $M_0 = 0$, so $E (X, f) = 0$. Moreover, by the Itô isometry,
\begin{equation}
\mathrm{Var}((X, f)) = E (X, f)^2 = \int_0^\infty f(t)^2 dt.
\end{equation}
It can also be verified that $(X, f)$ is Gaussian.
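These moment formulas are easy to probe by Monte Carlo: approximate $(X, f)$ by an Itô Riemann sum many times and compare the sample mean and variance with $0$ and $\int_0^1 f(t)^2\, dt$. A sketch in Python with the arbitrary choice $f(t) = t$ on $[0, 1]$, for which the isometry predicts variance $\int_0^1 t^2\, dt = 1/3$:

```python
import math
import random

random.seed(3)

def ito_integral(f, n_steps=200, T=1.0):
    """Ito Riemann-sum approximation of (X, f) = ∫_0^T f(t) dB_t."""
    dt = T / n_steps
    return sum(f(i * dt) * random.gauss(0.0, math.sqrt(dt))
               for i in range(n_steps))

samples = [ito_integral(lambda t: t) for _ in range(5000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

# Ito isometry predicts Var((X, f)) = ∫_0^1 t^2 dt = 1/3, and E(X, f) = 0.
print(mean, var)
```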
My main point is that a more appropriate definition of $Y$ might be
\begin{equation}
Y = \int_0^T h(t) dB_t.
\end{equation}
As a last note, because of the way $X$ was defined above, $X_t$ is not defined but $(X, f)$ is. That is, $X$ is a stochastic process whose index set is $T = \{ \text{test functions} \}$ rather than $T = [0, \infty)$. Moreover, again by the Itô isometry,
\begin{equation}
E (X, f) (X, g) = \int_0^\infty f(t) g(t) dt.
\end{equation}
Abandoning rigor again, this becomes
\begin{equation}
E (X, f) (X, g) = \int_0^\infty \int_0^\infty f(s) \delta(s - t) g(t) ds dt
\end{equation}
and it is in this sense that the covariance of $X$ is the Dirac delta.
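The bilinear covariance formula can be probed the same way: evaluate $(X, f)$ and $(X, g)$ as Itô sums on the same simulated path and estimate their covariance. A sketch in Python with the arbitrary choices $f(t) = 1$ and $g(t) = t$ on $[0, 1]$, for which $\int_0^1 f(t) g(t)\, dt = 1/2$:

```python
import math
import random

random.seed(4)

def pair_on_same_path(f, g, n_steps=200, T=1.0):
    """Evaluate (X, f) and (X, g) as Ito sums driven by one shared path."""
    dt = T / n_steps
    xf = xg = 0.0
    for i in range(n_steps):
        dB = random.gauss(0.0, math.sqrt(dt))  # shared Brownian increment
        t = i * dt
        xf += f(t) * dB
        xg += g(t) * dB
    return xf, xg

pairs = [pair_on_same_path(lambda t: 1.0, lambda t: t) for _ in range(5000)]
mf = sum(p[0] for p in pairs) / len(pairs)
mg = sum(p[1] for p in pairs) / len(pairs)
cov = sum((p[0] - mf) * (p[1] - mg) for p in pairs) / len(pairs)

# E[(X, f)(X, g)] = ∫_0^1 1 · t dt = 1/2.
print(cov)
```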
Edit: Note that we could leave the definition of $(X, f)$ in terms of the ordinary integral and do all the above calculations using Fubini's theorem and (ordinary) integration by parts (it's just a bit messier).
Since the process is Gaussian and one-dimensional, its probability density function (PDF) must be
$$f_X(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\tag{1}$$
where $\mu$ is the mean, and $\sigma^2$ is the variance. From the problem statement we know that $\mu=0$, so we only need to find $\sigma^2$.
From the definition of the auto-correlation function we know that
$$R_x(0)=E[X^2(t)]=\sigma^2+\mu^2=\sigma^2\tag{2}$$
where the last equality follows from the fact that $\mu=0$. From $(2)$ and the given auto-correlation function we know that $\sigma^2=R_x(0)=4$, so we finally get for the PDF
$$f_X(x)=\frac{1}{2\sqrt{2\pi}}e^{-\frac{x^2}{8}}\tag{3}$$
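As a quick sanity check on $(3)$, the density can be evaluated numerically to confirm that it integrates to $1$ and has second moment (variance) $4$. A sketch in Python (the grid spacing and cutoff are arbitrary choices):

```python
import math

def pdf(x):
    """Zero-mean Gaussian density from equation (3), with sigma^2 = 4."""
    return math.exp(-x * x / 8.0) / (2.0 * math.sqrt(2.0 * math.pi))

# Riemann-sum integrals over [-20, 20]; the tails beyond are negligible.
dx = 0.001
xs = [-20.0 + i * dx for i in range(40001)]
total = sum(pdf(x) for x in xs) * dx            # should be ≈ 1
second_moment = sum(x * x * pdf(x) for x in xs) * dx  # should be ≈ 4

print(total, second_moment)
```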
Best Answer
If $X_t$ and $X_{t+\tau}$ are jointly Gaussian with means $0$, variances $\sigma^2$ and covariance $c$, then $\text{Cov}(X_t^2, X_{t+\tau}^2) = 2 c^2$. One way to get this is from Isserlis's theorem, which gives $E[X_t^2 X_{t+\tau}^2] = E[X_t^2]\,E[X_{t+\tau}^2] + 2\,(E[X_t X_{t+\tau}])^2 = \sigma^4 + 2c^2$; subtracting $E[X_t^2]\,E[X_{t+\tau}^2] = \sigma^4$ gives the result.
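The identity is easy to confirm by simulation: draw correlated Gaussian pairs with $\sigma^2 = 1$ and covariance $c$, and estimate $\text{Cov}(X^2, Y^2)$. A sketch in Python with the arbitrary choice $c = 0.5$, where the prediction is $2c^2 = 0.5$:

```python
import math
import random

random.seed(5)

c = 0.5       # covariance of the jointly Gaussian pair (unit variances)
n = 50_000

xs, ys = [], []
for _ in range(n):
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    xs.append(z1)                                    # X = Z1
    ys.append(c * z1 + math.sqrt(1 - c * c) * z2)    # Cov(X, Y) = c

mx2 = sum(x * x for x in xs) / n
my2 = sum(y * y for y in ys) / n
cov_sq = sum(x * x * y * y for x, y in zip(xs, ys)) / n - mx2 * my2

# Isserlis predicts Cov(X^2, Y^2) = 2 c^2 = 0.5.
print(cov_sq)
```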