To keep notation simple, we write $S_t$ instead of $S_t^i$, $\sigma_t^j$ instead of $\sigma_{t}^{ij}$ and so on. That's okay, because $i$ is a fixed number throughout this calculation.
Suppose $(X_t)_{t \geq 0}$ is an Itô process of the form
$$dX_t = b(t) \, dt + \eta(t) \, dW_t$$
where $\eta = (\eta_1,\ldots,\eta_m)$ and $(W_t)_{t \geq 0}$ is an $m$-dimensional Brownian motion. Then Itô's formula states that
$$\begin{align*} f(X_t)-f(X_0) &= \int_0^t f'(X_s) \, dX_s + \frac{1}{2} \int_0^t f''(X_s) \, d\langle X \rangle_s \end{align*}$$
for $f \in C^2$ where
$$\begin{align*} \int_0^t f'(X_s) \, dX_s &:= \int_0^t f'(X_s) \eta_s \, dW_s + \int_0^t f'(X_s) b(s) \, ds \\ d\langle X \rangle_s &:= \sum_{j=1}^m \eta_j^2(s) \, ds. \end{align*}$$
In your setting, we have $$b(t) = \left(\mu_t - \frac{1}{2} \sum_{j=1}^m \sigma_t^j \right)^2 \qquad \eta(t) := (\sigma_t^1,\ldots,\sigma_t^m).$$
Applying Itô's formula for $f(x) := \exp(x)$ shows that $(S_t)_{t \geq 0}$ is a solution to the given SDE.
Since $\sigma$ is bounded and continuous, it is not difficult to see that
$$\sum_{j=0}^{n-1} \sigma_{j/n} (W_{(j+1)/n}-W_{j/n}) \xrightarrow[]{n \to \infty} \int_0^1 \sigma_s \, dW_s \quad \text{in probability.} \tag{$\star$}$$
This implies
$$\begin{align*} \mathbb{E} \left( \exp \left[ i \xi \int_0^1 \sigma_s \, dW_s \right] \mid \mathcal{F}_{\sigma} \right) &= \lim_{n \to \infty} \mathbb{E} \left( \exp \left[ i \xi \sum_{j=0}^{n-1} \sigma_{j/n} (W_{(j+1)/n}-W_{j/n}) \right] \mid \mathcal{F}_{\sigma} \right) \\ &= \lim_{n \to \infty} \mathbb{E} \left( \prod_{j=0}^{n-1} \exp \left[ i \xi \sigma_{j/n} (W_{(j+1)/n}-W_{j/n}) \right] \mid \mathcal{F}_{\sigma} \right). \end{align*}$$
Since $(W_t)_{t \in [0,1]}$ and $\mathcal{F}_{\sigma}$ are independent, we obtain that
$$\mathbb{E} \left( \exp \left[ i \xi \int_0^1 \sigma_s \, dW_s \right] \mid \mathcal{F}_{\sigma} \right) = \lim_{n \to \infty} \mathbb{E} \left( \prod_{j=0}^{n-1} \exp \left( i \xi x_j (W_{(j+1)/n}-W_{j/n}) \right) \right] \bigg|_{x_j = \sigma_{j/n}};$$
using that $W_{(j+1)/n}-W_{j/n} \sim N(0,1/n$, $j=0,\ldots,n-1$ are independent, we find that
$$\begin{align*} \mathbb{E} \left( \exp \left[ i \xi \int_0^1 \sigma_s \, dW_s \right] \mid \mathcal{F}_{\sigma} \right) &= \lim_{n \to \infty} \prod_{j=0}^{n-1} \exp \left(- \frac{1}{2}\xi^2 x_j^2 \frac{1}{n} \right)\bigg|_{x_j = \sigma_{j/n}} \\ &= \lim_{n \to \infty} \exp \left(- \frac{1}{2} \xi^2 \sum_{j=1}^n \sigma_{j/n}^2 \frac{1}{n} \right). \end{align*}$$
The expression on the right-hand side is a Riemann sum and therefore we conclude that
$$ \mathbb{E} \left( \exp \left[ i \xi \int_0^1 \sigma_s \, dW_s \right] \mid \mathcal{F}_{\sigma} \right) = \exp \left(- \frac{1}{2}\xi^2 \int_0^1 \sigma_s^2 \, ds \right). \tag{1}$$
Finally, we note that
$$i^k \mathbb{E}(X^k \mid \mathcal{F}) =\frac{d^k}{d\xi^k} \mathbb{E}(e^{i \xi X} \mid \mathcal{F}_{\sigma}) \bigg|_{\xi=0}$$
for any real-valued random variable $X$ such that $\mathbb{E}(|X|^k)<\infty$; differentiating $(1)$ $k$ times with respect to $\xi$ and evaluating at $\xi=0$ we conclude that
$$ \mathbb{E} \left( \left[ \int_0^1 \sigma_s \, dW_s \right]^k \mid \mathcal{F}_{\sigma} \right)=\mathbb{E}(U^k) \left( \int_0^1 \sigma_s^2 \, ds \right)^{k/2}$$
for $U \sim N(0,1)$.
Remark: Instead of calculating the conditional characteristic function, you can also show that the convergence in $(\star)$ holds in $L^k$ and then use this convergence to conclude that $$\mathbb{E} \left( \left( \int_0^1 \sigma_s \, dW_s \right)^k \mid \mathcal{F}_{\sigma} \right) = \lim_{n \to \infty} \left( \left( \sum_{j=0}^{n-1} \sigma_{j/n} (W_{(j+1)/n}-W_{j/n}) \right)^k \mid \mathcal{F}_{\sigma} \right);$$ then you can proceed with similar arguments as above.
Best Answer
Your answer depends, as you guessed on the process $u \mapsto \sigma_u$.
You can amplify your approach using the Ito-isometry with the BDG-inequality: \begin{align} \mathbb{E}[|\int_0^t\sigma_udW_u - & \int_0^s\sigma_u \,dW_u|^{2p}] \stackrel{\text{BDG}}{\leq} c(p) \mathbb{E}[(\int_s^t |\sigma_u|^2 \, du)^p] \\ & \leq c(p) \mathbb{E}[(t-s)^{p-1} \int_s^t |\sigma_u|^{2p} \, du] \quad (\text{Hölder-ineq. on} \int_s^t)\\ & = c(p) (t-s)^{p-1} \int_s^t \mathbb{E} [|\sigma_u|^{2p} ]\, du \, . \end{align} Here $(t-s)^{p-1}$ appears from the use of the Hölder inequality with the integrand $1$. Now you can think of specifying some integrability properties of $u\to \sigma_u$: bounded, uniformly in $L^{2p}$, etc. and continue with the Kolmogorov-Centsov theorem.
I think, however, as you asked for SDE, it will suffice to consider that situation: $$ dX_t = \sigma(X_t) \, dW_t $$ where $\sigma:\mathbb{R} \to \mathbb{R}$ and $W$ is a standard Brownian motion. If we assume a further condition, called the linear growth condition on $\sigma$: $$ |\sigma(x)| \leq c_1 (1+ |x|)\, , \quad x \in \mathbb{R} \, ,$$ your claim on almost Hölder $\frac{1}{2}$-regularity is true. In order to ensure existence of a global solution (i.e. for all times $t\geq 0$) this condition is very natural.
Explanation:\ You need control over $\sup_{s\leq u \leq t} \mathbb{E}[|\sigma(X_u)|^{2p}]$ in that case. Assuming this growth condition, one has the following estimate $$ \mathbb{E}[|\sigma(X_u)|^{2p}] \leq (2c_1)^{2p} (1+ \mathbb{E}[|X_u|^{2p}] ) . $$ So we need the $p$-th moment of $X_u$. We can obtain a bound on that again using BDG-inequality: \begin{align} \mathbb{E} [|X_u|^{2p}] & = \mathbb{E} [ |\int_0^u \sigma(X_v) \, dW_v |^{2p}] \\ & \leq c(p) \mathbb{E}[ (\int_0^u |\sigma(X_v)|^2 \, dv)^{p} ] \\ & \leq c(p) \mathbb{E} [u^{p-1} \int_0^u |\sigma(X_v)|^{2p} \, dv ] \quad (\text{Hölder-ineq. on} \int_0^u) \\ & \leq c(p) c_1^{2p}u^{p-1} \mathbb{E} [ \int_0^u (1+|X_v|)^{2p} \, dv ]\\ & \leq c(p) (2c_1)^{2p} u^{p-1} (u + \int_0^u \mathbb{E} [|X_v|^{2p}] \,dv ). \end{align} Now you need to apply Gronwall's lemma to obtain a bound C(c,c_1,c_0 ,p, u) on $\mathbb{E} [|X_u|^{2p}]$ with the property that $C(c,c_1,c_0,p, u) \leq C(c,c_1,c_0,p, t)$ for $u \leq t$. Then you can continue: \begin{align} \mathbb{E}[|X_t - X_s|^{2p}] & \leq c(p) (t-s)^{p-1} \int_s^t C(c,c_1,c_0,p,u) \, du \\ & \leq c(p) (t-s)^{p-1} \int_s^t C(c,c_1,c_0,p,t) \, du \\ & \leq c(p) C(c,c_1,c_0,p,t) (t-s)^p. \end{align} This allows you to apply Kolmogorov-Centsov.
This also works fine for SDEs including a drift term $+b(X_t)\, dt$.
As you ask for references: the main source of the description is an article by Dalang, however on SPDE-regularity: Theorem 13 in http://ejp.ejpecp.org/article/view/43/85 and also the appendix in Mytnik, Perkins and Sturm,2006.
It is not too unlikely that this kind of calculation can also be found in a textbook on stochastic analysis.