We have to assume that the underlying probability space is complete; otherwise the assertion might fail.
So, suppose that $(\Omega,\mathcal{A},\mathbb{P})$ is a complete probability space and $(X_t)_{t \in [0,1]}$ a process with almost surely continuous sample paths, i.e. there exists a null set $N \in \mathcal{A}$ such that $$[0,1] \ni t \mapsto X_t(\omega)$$ is continuous for all $\omega \in \tilde{\Omega} := \Omega \backslash N$. Now
$$\tilde{X}_t(\omega) := \begin{cases} X_t(\omega), & \omega \in \tilde{\Omega}, \\ 0, & \omega \in N \end{cases}$$
defines a stochastic process on $\Omega$ with continuous sample paths, and therefore
$$\sup_{t \in [0,1]} \tilde{X}_t = \sup_{t \in [0,1] \cap \mathbb{Q}} \tilde{X}_t$$
is measurable as a countable supremum of measurable random variables. On the other hand, we have
$$\tilde{S}(\omega) := \sup_{t \in [0,1]} \tilde{X}_t(\omega) = \sup_{t \in [0,1]} X_t(\omega) =: S(\omega) \quad \text{for all $\omega \in \tilde{\Omega} = \Omega \backslash N$},$$
and so
$$\{S \in B\} = \left( \{\tilde{S} \in B \} \cap N^c \right) \cup \left( \{S \in B \} \cap N \right)$$
for any Borel set $B$. Since $N \in \mathcal{A}$ and $\tilde{S}$ is measurable, we know that
$$\left( \{\tilde{S} \in B \} \cap N^c \right) \in \mathcal{A}.$$
Moreover,
$$\left\{ S \in B \right\} \cap N \subseteq N,$$
and since the probability space is complete, this implies
$$\left\{ S \in B \right\} \cap N \in \mathcal{A}.$$
Combining both observations shows $\{S \in B\} \in \mathcal{A}$, which proves the measurability of $S$.
Remark. More generally, the following statement holds in complete probability spaces:
Let $(\Omega,\mathcal{A},\mathbb{P})$ and $(E,\mathcal{B},\mathbb{Q})$ be two measure spaces and assume that $(\Omega,\mathcal{A},\mathbb{P})$ is complete. Let $X, Y: \Omega \to E$ be two mappings. If $X$ is measurable and $X=Y$ almost surely, then $Y$ is measurable.
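The proof of this remark mirrors the argument above. Choose a null set $N \in \mathcal{A}$ with $X = Y$ on $N^c$; then, for every $B \in \mathcal{B}$,
$$\{Y \in B\} = \big( \{X \in B\} \cap N^c \big) \cup \big( \{Y \in B\} \cap N \big).$$
The first set lies in $\mathcal{A}$ because $X$ is measurable, and the second set is a subset of $N$ and hence lies in $\mathcal{A}$ by completeness.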
$\def\si{\sigma}$
$\def\th{\theta}$
Let $M(u,t)=\int_0^t\si(u,r)\,dW_r$ and assume the sample paths of $\si$ have continuous partial derivatives. In this answer, I will repeatedly use the fact that if $\th$ is a bounded variation process, then
$$
\int_0^t\th_r\,dW_r=\th_tW_t-\int_0^tW_r\,d\th_r.\tag{IBP}
$$
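As a quick numerical sanity check of (IBP), not part of the proof: for a deterministic, smooth $\th$ the discretized identity holds exactly along each sampled path, by summation by parts. The integrand $\th_r = \sin r$, the grid size, and the seed are arbitrary illustrative choices.

```python
import numpy as np

# Discrete analogue of (IBP) for a deterministic, smooth theta (hence of
# bounded variation): the left Riemann sum for int_0^T theta_r dW_r equals
# theta_T * W_T - sum of W d(theta), exactly, by Abel summation.
rng = np.random.default_rng(0)
N, T = 1000, 1.0
t = np.linspace(0.0, T, N + 1)
dW = rng.normal(0.0, np.sqrt(T / N), N)      # Brownian increments
W = np.concatenate(([0.0], np.cumsum(dW)))   # W[0] = 0

theta = np.sin(t)                            # smooth sample integrand
lhs = np.sum(theta[:-1] * dW)                # int_0^T theta_r dW_r
rhs = theta[-1] * W[-1] - np.sum(W[1:] * np.diff(theta))
assert np.isclose(lhs, rhs)                  # exact up to float rounding
```

In the limit it does not matter that the second sum evaluates $W$ at right endpoints, since $[\th, W] = 0$ for a process $\th$ of bounded variation.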
Lemma.
$$
\int_0^s M(u,t)\,du = \int_0^s\int_0^t\si(u,r)\,dW_r\,du
= \int_0^t\int_0^s\si(u,r)\,du\,dW_r.
$$
Proof. Fix $s$. Let $Y_t=\int_0^t\int_0^s\si(u,r)\,du\,dW_r$. By (IBP), we have
$$
Y_t = W_t\int_0^s\si(u,t)\,du - \int_0^tW_r\int_0^s\si_2(u,r)\,du\,dr,
$$
where $\si_2$ is the partial derivative of $\si$ with respect to the second argument. For each sample path, the above integrals are ordinary integrals. Thus, by Fubini's theorem,
\begin{align}
Y_t &= W_t\int_0^s\si(u,t)\,du - \int_0^s\int_0^tW_r\si_2(u,r)\,dr\,du\tag{1}\\
&= \int_0^s\left({W_t\si(u,t) - \int_0^tW_r\si_2(u,r)\,dr}\right)\,du.
\end{align}
By (IBP),
$$
\int_0^t W_r\si_2(u,r)\,dr = W_t\si(u,t) - \int_0^t \si(u,r)\,dW_r.\tag{2}
$$
Thus, $Y_t=\int_0^s\int_0^t \si(u,r)\,dW_r\,du$. $\square$
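The lemma can also be checked numerically: with a common discretization and the same Brownian increments, the two iterated sums agree exactly, because finite sums may be exchanged. Here $\si(u,r) = u e^{-r}$, the grid size, and the seed are arbitrary illustrative choices.

```python
import numpy as np

# Discretize both iterated integrals of the lemma with the SAME Brownian
# increments; swapping the order of summation shows they agree pathwise.
rng = np.random.default_rng(1)
N, s, t = 400, 0.7, 1.0
u = np.linspace(0.0, s, N)                 # grid in u
r = np.linspace(0.0, t, N)                 # grid in r
du = s / N
dW = rng.normal(0.0, np.sqrt(t / N), N)    # Brownian increments on [0, t]

sig = np.outer(u, np.exp(-r))              # sig[i, j] = sigma(u_i, r_j)
# int_0^s ( int_0^t sigma(u, r) dW_r ) du
lhs = np.sum(sig @ dW) * du
# int_0^t ( int_0^s sigma(u, r) du ) dW_r
rhs = np.sum((du * sig.sum(axis=0)) * dW)
assert np.isclose(lhs, rhs)
```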
Let $X_t=\int_0^t M(u,t)\,du$. By (1) with $s=t$,
$$
X_t = W_t\int_0^t\si(u,t)\,du - \int_0^t\int_0^tW_r\si_2(u,r)\,dr\,du.
$$
Thus, again using (IBP),
\begin{align}
dX_t &= \int_0^t\si(u,t)\,du\,dW_t
+ W_t\left({
\si(t,t) + \int_0^t\si_2(u,t)\,du
}\right)\,dt\\
&\qquad - \left({
\int_0^tW_r\si_2(t,r)\,dr + \int_0^tW_t\si_2(u,t)\,du
}\right)\,dt\\
&= \int_0^t\si(u,t)\,du\,dW_t
+ \left({
W_t\si(t,t) - \int_0^tW_r\si_2(t,r)\,dr
}\right)\,dt.
\end{align}
By (2) with $u=t$, this gives
\begin{align}
dX_t &= \int_0^t\si(u,t)\,du\,dW_t
+ \int_0^t \si(t,r)\,dW_r\,dt\\
&= \int_0^t\si(u,t)\,du\,dW_t
+ M(t,t)\,dt.
\end{align}
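As a consistency check, take $\si \equiv 1$ (purely for illustration): then $M(u,t) = W_t$ and $X_t = \int_0^t W_t\,du = tW_t$, and the Itô product rule gives
$$
dX_t = t\,dW_t + W_t\,dt = \int_0^t\si(u,t)\,du\,dW_t + M(t,t)\,dt,
$$
in agreement with the formula above.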
Finally, let
\begin{align}
L(u,t) &= L(u,0) + \int_0^t\mu(u,r)\,dr + \int_0^t\si(u,r)\,dW_r\\
&= L(u,0) + \int_0^t\mu(u,r)\,dr + M(u,t).
\end{align}
Then
\begin{align}
\int_0^t L(u,t)\,du &= \int_0^t L(u,0)\,du
+ \int_0^t\int_0^t\mu(u,r)\,dr\,du + \int_0^t M(u,t)\,du\\
&= \int_0^t L(u,0)\,du + \int_0^t\int_0^t\mu(u,r)\,dr\,du + X_t.
\end{align}
Therefore,
\begin{align}
d\left({\int_0^t L(u,t)\,du}\right)
&= L(t,0)\,dt + \int_0^t\mu(t,r)\,dr\,dt + \int_0^t\mu(u,t)\,du\,dt\\
&\qquad + \int_0^t\si(u,t)\,du\,dW_t + M(t,t)\,dt\\
&= L(t,t)\,dt + \int_0^t\mu(u,t)\,du\,dt
+ \int_0^t\si(u,t)\,du\,dW_t.
\end{align}
This formula has an extra $dW$ term which your formula does not. In hindsight, this seems reasonable. By (1), $\int_0^s M(u,t)\,du$ is a process that, for fixed $t$, is of bounded variation in $s$, but, for fixed $s$, has nonvanishing quadratic variation in $t$. Your formula would imply that this process is of bounded variation along the diagonal $s=t$, which seems fairly counterintuitive. (Edit: I just noticed the phrase, "not including any terms involving the brownian motion", so I guess the formula you posted at the end was intentionally incomplete.)
Edit 2: If we interpret expressions such as $\int_0^t f(u,t)\,dZ_t\,du$ as meaning $\left({\int_0^t f(u,t)\,du}\right)\,dZ_t$, then we have just proved that under suitable assumptions on $\mu$ and $\si$,
$$
d\left({\int_0^t L(u,t)\,du}\right)
= L(t,t)\,dt + \int_0^t dL(u,t)\,du.
$$
Best Answer
Hint:
To understand how to work with this type of integral, first consider an integral of Brownian motion:
$$I = \int_{0}^{T} B_t dt$$
The integral makes sense pathwise because Brownian motion has almost surely continuous sample paths. Consider the approximation by a Riemann sum over a partition $0 = t_0 < t_1 < \cdots < t_n = T$ of $[0,T]$:
$$ S_n = \sum_{k=1}^{n} B_{t_k} (t_k - t_{k-1})$$
Now you can think of $S_n$ as a random variable with a multivariate normal distribution. Try to establish properties of this random variable such as its mean, variance, and higher moments. You will need the fact that the values of Brownian motion at different times are correlated, i.e.,
$$E(B_{t_1}B_{t_2}) = \min(t_1,t_2).$$
Then consider taking limits as $n \to \infty$ with mesh $\max_k(t_k-t_{k-1}) \to 0$.
You will find, for example, that
$$E(I) = 0, \qquad E(I^2) = \frac{T^3}{3}, \quad \ldots$$
Then you can move on to a more general case where the integrand is a function of the Brownian motion.
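The moment claims above can also be checked by simulation; note that $E(I^2) = \int_0^T\!\int_0^T \min(s,t)\,ds\,dt = T^3/3$. The path count, step count, seed, and tolerances below are illustrative choices, with the tolerances set as loose statistical bounds.

```python
import numpy as np

# Monte Carlo check of E(I) = 0 and E(I^2) = T^3 / 3 for
# I = int_0^T B_t dt, approximated by a Riemann sum along each path.
rng = np.random.default_rng(2)
M, N, T = 20000, 500, 1.0
dt = T / N
# Each row is one Brownian path sampled on the grid t_1, ..., t_N.
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(M, N)), axis=1)
I = B.sum(axis=1) * dt                        # Riemann sum per path

assert abs(I.mean()) < 0.03                   # E(I) = 0
assert abs((I**2).mean() - T**3 / 3) < 0.03   # E(I^2) = 1/3 for T = 1
```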