This is a bit late, but I see that the main points in this question have not been completely addressed. I'll set
\begin{equation}
\sigma = 1
\end{equation}
for this answer.
The definition of white noise may be context-dependent: How you define it depends on what you want to do with it. There's nothing inherently wrong with saying that white noise (indexed by a set $T$) is just the process of iid standard normal random variables indexed by $T$, i.e. $E[X(t)X(s)] = \begin{cases} 1 & t = s \\ 0 & t \neq s \end{cases}.$ However, as cardinal noted here, Example 1.2.5 of Kallianpur's text shows that this process is not measurable (as a function of $(t, \omega)$). This is why, as Did commented above, $Y$ is undefined (with this definition of $X$). Thus, this definition of white noise is not appropriate for defining objects like $Y$.
Rather, you want $X$ to have covariance given by the Dirac delta. But $\delta$ is not a function at all; it is a measure, and the natural framework for understanding it is the theory of distributions (also called generalized functions---not to be confused with "probability distributions"). Likewise, the appropriate context for white noise is the theory of random distributions.
Let's warm up with a heuristic explanation: We'll think of white noise as the "derivative" of Brownian motion: "$dB_t/dt = X_t$". So ignoring rigor for a moment, we could write
\begin{equation}
\int_0^T h(t) X(t) dt = \int_0^T h(t) \frac{dB_t}{dt} dt = \int_0^T h(t) dB_t.
\end{equation}
The reason this isn't rigorous is that Brownian motion is nowhere differentiable. However, the theory of distributions allows us to "differentiate" non-differentiable functions. First of all, a distribution is a continuous linear functional (a linear map taking values in the real numbers) on a space of "test functions" (usually smooth functions of compact support). A continuous function $F$ can be viewed as a distribution via the pairing
\begin{equation}
(F, f) = \int_0^\infty F(t) f(t) dt.
\end{equation}
The distributional derivative of $F$ is the distribution $F'$ whose pairing with a test function $f$ is defined by
\begin{equation}
(F', f) = -(F, f').
\end{equation}
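As a sanity check on this definition, take $F$ to be the Heaviside step $H_a(t) = \mathbf{1}_{[a,\infty)}(t)$ for some $a > 0$. Since test functions vanish at infinity,
\begin{equation}
(H_a', f) = -(H_a, f') = -\int_a^\infty f'(t)\,dt = f(a) = (\delta_a, f),
\end{equation}
so the distributional derivative of a unit step is exactly a Dirac delta at the jump, even though $H_a$ is not classically differentiable there.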
Thinking of Brownian motion as a random function, we can define white noise $X$ as its distributional derivative. Thus, $X$ is a random distribution whose pairing with a test function $f$ is the random variable
\begin{equation}
(X, f) = -(B, f') = -\int_0^\infty B(t) f'(t) dt.
\end{equation}
By stochastic integration by parts,
\begin{equation}
(X, f) = \int_0^\infty f(t) dB_t;
\end{equation}
this is the Itô integral of $f$ with respect to $B$.
Now a well-known fact in stochastic calculus is that $M_T = \int_0^T f(t) dB_t$ is a martingale starting at $M_0 = 0$, so $E (X, f) = 0$. Moreover, by the Itô isometry,
\begin{equation}
\mathrm{Var}((X, f)) = E (X, f)^2 = \int_0^\infty f(t)^2 dt.
\end{equation}
It can also be verified that $(X, f)$ is Gaussian.
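These two facts are easy to check by Monte Carlo: approximate $(X, f) = \int_0^\infty f\,dB$ by a Riemann--It\^o sum. A minimal sketch, assuming numpy; the test function $f(t) = e^{-t}$ is chosen purely for illustration (not compactly supported, but decaying fast enough that truncating the integral is harmless), and $\int_0^\infty e^{-2t}\,dt = 1/2$.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, trials = 10.0, 1000, 5000   # truncate [0, inf) at T = 10
dt = T / n
t = np.arange(n) * dt             # left endpoints (Ito convention)
f = np.exp(-t)                    # illustrative test function f(t) = e^{-t}

# Brownian increments dB_i ~ N(0, dt), one row of increments per trial
dB = rng.normal(0.0, np.sqrt(dt), size=(trials, n))
Xf = dB @ f                       # (X, f) ~ sum_i f(t_i) dB_i

print(np.mean(Xf))                # ~ 0, the martingale property
print(np.var(Xf))                 # ~ 1/2, the Ito isometry
```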
My main point is that a more appropriate definition of $Y$ might be
\begin{equation}
Y = \int_0^T h(t) dB_t.
\end{equation}
As a last note, because of the way $X$ was defined above, $X_t$ is not defined but $(X, f)$ is. That is, $X$ is a stochastic process whose index set is the set of test functions rather than $[0, \infty)$. Moreover, again by the Itô isometry,
\begin{equation}
E (X, f) (X, g) = \int_0^\infty f(t) g(t) dt.
\end{equation}
Abandoning rigor again, this becomes
\begin{equation}
E (X, f) (X, g) = \int_0^\infty \int_0^\infty f(s) \delta(s - t) g(t) ds dt
\end{equation}
and it is in this sense that the covariance of $X$ is the Dirac delta.
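The covariance identity can be checked numerically in the same spirit: pair the same Brownian increments against two test functions. In this sketch (assuming numpy) the choices $f(t)=e^{-t}$ and $g(t)=e^{-2t}$ are purely illustrative, with $\int_0^\infty f(t)g(t)\,dt = 1/3$.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, trials = 10.0, 1000, 5000
dt = T / n
t = np.arange(n) * dt
f, g = np.exp(-t), np.exp(-2 * t)   # illustrative test functions

# the SAME Brownian increments paired against both test functions
dB = rng.normal(0.0, np.sqrt(dt), size=(trials, n))
Xf, Xg = dB @ f, dB @ g

print(np.mean(Xf * Xg))             # ~ 1/3, i.e. the integral of f*g
```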
Edit: Note that we could leave the definition of $(X, f)$ in terms of the ordinary integral and do all the above calculations using Fubini's theorem and (ordinary) integration by parts (it's just a bit messier).
The disparity arises from the fact that your discretization of the continuous process does not assign the appropriate variance to the $X_i$. Here's the key (for heuristics, see here, Section 3.2):
If $\{X(t)\}_{t\in\mathbb{R}}$ is a continuous Gaussian process such that
$$\begin{align}&E[X(t)]=0\\
&E[X(t)X(t')]=\sigma^2 \delta(t-t')\end{align}
$$ then a discrete sampling of the process, with a uniform sampling interval $\Delta$, viz.,
$$X(t),\,X(t+\Delta),\,X(t+2\Delta),\ldots$$ is simulated as an i.i.d. sequence of Gaussian$(\text{mean}=0,\text{variance}=\frac{\sigma^2}{\Delta})$ random variables, the quality of the simulation improving as $\Delta\to 0$.
(The power spectral density of the i.i.d. sequence approaches that of the continuous process that it simulates -- flat and equal to the same constant value $\sigma^2$ -- except that for the i.i.d. sequence it is "band limited", i.e. vanishing outside of a finite-width interval.)
Thus, in the present case, to simulate
$$I = \int_0^L X(t) \, dt \approx \sum_{i=1}^N X(i\Delta)\,\Delta
$$
where $\Delta=\frac{L}{N}$, one would use
$$\hat{I} = \Delta\sum_{i=1}^N X_i
$$
with $X_1,X_2,\ldots$ i.i.d. $\text{Gaussian}(\text{mean}=0,\text{variance}=\frac{\sigma^2}{\Delta})$. Then
$$\begin{align}
\text{Var}[\hat{I}] = \Delta^2\,N\,\frac{\sigma^2}{\Delta} = (N\Delta)\sigma^2 = L\sigma^2.
\end{align}
$$
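A quick simulation confirms that the $\sigma^2/\Delta$ scaling makes $\mathrm{Var}[\hat{I}] = L\sigma^2$, independent of the number of sample points $N$. A sketch assuming numpy; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, L, N, trials = 1.0, 2.0, 500, 10000
dt = L / N                           # sampling interval Delta

# i.i.d. Gaussian(0, sigma^2 / Delta) samples of the discretized noise
X = rng.normal(0.0, sigma / np.sqrt(dt), size=(trials, N))
I_hat = dt * X.sum(axis=1)           # I_hat = Delta * sum_i X_i

print(np.var(I_hat))                 # ~ L * sigma^2 = 2
```

Doubling $N$ (halving $\Delta$) leaves the printed variance unchanged, which is exactly the point: without the $1/\Delta$ variance scaling, the simulated variance would shrink with the grid.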
Best Answer
Not exactly what you are asking for, but if you use the white noise analysis framework developed by Takeyuki Hida, then the white noise process $[0,\infty)\ni t\mapsto w(t)$ can be viewed as an $(S)^*$-valued process, i.e. a family of Hida distributions ("stochastic distributions") indexed by time.
In that case the integral $$\int_0^T w(t)\,dt$$ can be defined rigorously as a Pettis integral in $(S)^*$, and using the fact that formally $w(t)=\frac{d}{dt} B(t)$, where $B$ is a Brownian motion, one obtains $$\int_0^T w(t)\,dt=B(T).$$
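At the discrete level this identity is just a telescoping sum: if white noise is realized as the finite-difference quotient of a simulated Brownian path, then $\Delta\sum_i w(t_i) = B(T)$ exactly. A sketch, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(3)
T, n = 1.0, 1000
dt = T / n

# Brownian path built from its independent N(0, dt) increments
dB = rng.normal(0.0, np.sqrt(dt), size=n)
B = np.concatenate(([0.0], np.cumsum(dB)))

# finite-difference "white noise" w(t_i) ~ dB/dt
w = np.diff(B) / dt

# the Riemann sum for the integral of w over [0, T] telescopes to B(T)
print(np.isclose(dt * w.sum(), B[-1]))   # True
```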
You can find further discussion of white noise analysis in the following books 1, 2.