You may use a computational method to approximate, as accurately as desired, the probability density function of $I(t)=\int_0^t \cos(B(s))\,\mathrm{d}s$. I will do so for $0\leq t\leq 1$.
Consider the Karhunen-Loève expansion of $B(t)$ on $[0,1]$:
$$ B(t)=\sum_{j=1}^\infty \frac{\sqrt{2}}{\left(j-\frac12\right)\pi}\sin\left(\left(j-\frac12\right)\pi t\right)\xi_j, $$
where $\xi_1,\xi_2,\ldots$ are independent $\text{Normal}(0,1)$ random variables. The series converges in $L^2([0,1]\times\Omega)$. Truncate it at order $J$:
$$ B_J(t)=\sum_{j=1}^J \frac{\sqrt{2}}{\left(j-\frac12\right)\pi}\sin\left(\left(j-\frac12\right)\pi t\right)\xi_j, $$
and define $I_J(t)=\int_0^t \cos(B_J(s))\,\mathrm{d}s$.
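As a quick illustration (a small sketch of my own; the truncation order and the number of paths are arbitrary choices), one can draw a few sample paths of $B_J$:

(* draw a few sample paths of the truncated Karhunen-Loève expansion on [0,1] *)
J = 40;
paths = Table[
   With[{xi = RandomVariate[NormalDistribution[0, 1], J]},
    (* one realization of B_J, as an explicit expression in t *)
    Sqrt[2]*Sum[Sin[(j - 0.5)*Pi*t]/((j - 0.5)*Pi)*xi[[j]], {j, 1, J}]],
   {3}];
Plot[Evaluate[paths], {t, 0, 1}, Frame -> True]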
It is easy to see that:
$I_J(t)\rightarrow I(t)$ in $L^2(\Omega)$, for each $0\leq t\leq 1$. Indeed, by the Cauchy-Schwarz inequality and the fact that $\cos$ is $1$-Lipschitz, $\mathbb{E}[(I(t)-I_J(t))^2]\leq t\,\mathbb{E}\left[\int_0^t (B(s)-B_J(s))^2\,\mathrm{d}s\right]=t\|B-B_J\|_{L^2([0,t]\times\Omega)}^2\rightarrow 0$, as $J\rightarrow\infty$.
$I_J(t)\rightarrow I(t)$ almost surely, for each $0\leq t\leq 1$. Indeed, for almost every $\omega\in\Omega$, $B_J(\cdot)(\omega)\rightarrow B(\cdot)(\omega)$ in $L^2([0,1])$ (the Karhunen-Loève series converges almost surely in $L^2([0,1])$, in fact even uniformly on $[0,1]$). Since $\cos$ is Lipschitz, $\cos(B_J(\cdot)(\omega))\rightarrow \cos(B(\cdot)(\omega))$ in $L^2([0,1])$, and $I_J(t)(\omega)\rightarrow I(t)(\omega)$ for each $t\in[0,1]$ follows from the Cauchy-Schwarz inequality.
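In fact, the rate of the $L^2$ convergence can be quantified. Since $\int_0^1\sin^2\left(\left(j-\frac12\right)\pi t\right)\mathrm{d}t=\frac12$, a short computation (added here for completeness) gives
$$ \|B-B_J\|_{L^2([0,1]\times\Omega)}^2=\sum_{j>J}\frac{1}{\left(j-\frac12\right)^2\pi^2}<\frac{1}{\pi^2 J}, $$
so the bound in the first fact yields $\mathbb{E}[(I(t)-I_J(t))^2]<t/(\pi^2 J)$: the $L^2(\Omega)$ error decays at least like $J^{-1/2}$, uniformly for $t\in[0,1]$.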
Although these two facts alone do not guarantee that the density functions of $\{I_J(t):J\geq1\}$ tend to the density function of $I(t)$, the density function of $I_J(t)$ is a very good candidate (recall that this is a computational method, not a proof). The advantage of $I_J(t)$ is that it involves only a finite number of the $\xi_j$, so it is possible to obtain exact realizations of $I_J(t)$. And if we generate a sufficiently large number $M$ of realizations of $I_J(t)$, then a kernel density estimation allows obtaining an approximate density function for $I_J(t)$.
I have written a function in Mathematica to approximate the distribution of $I(T)$, for $0\leq T\leq 1$, using a truncation order $J$ and a number of simulations $M$:
distributionIT[T_, J_, simulations_] :=
 Module[{realizationsIT, xi, BJ},
  (* draw M = simulations exact realizations of I_J(T) *)
  realizationsIT = Table[
    (* sample the J standard normal coefficients of the truncation *)
    xi = RandomVariate[NormalDistribution[0, 1], J];
    (* truncated Karhunen-Loève path B_J *)
    BJ[t_] := Sqrt[2]*Sum[Sin[(j - 0.5)*Pi*t]/((j - 0.5)*Pi)*xi[[j]], {j, 1, J}];
    (* one realization of I_J(T) *)
    NIntegrate[Cos[BJ[t]], {t, 0, T}],
    {simulations}];
  (* kernel density estimate built from the M realizations *)
  SmoothKernelDistribution[realizationsIT]
 ];
Let us do a numerical example. Choose $T=1$. Write
distribution1 = distributionIT[1, 40, 50000];
plot1 = Plot[PDF[distribution1, x], {x, -2, 2}, PlotStyle -> Green, Frame -> True, PlotRange -> All];
distribution2 = distributionIT[1, 50, 100000];
plot2 = Plot[PDF[distribution2, x], {x, -2, 2}, PlotStyle -> Blue, Frame -> True, PlotRange -> All];
Legended[Show[plot1, plot2], LineLegend[{Green, Blue}, {"J=40, M=50000", "J=50, M=100000"}]]
We plot the estimated density functions for $J=40$, $M=50000$ and for $J=50$, $M=100000$. The two curves are visually indistinguishable, which indicates that the approximation has stabilized and that our method provides a good approximation of the probability density function of $I(1)$.
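To go beyond a visual comparison, one could also measure the discrepancy between the two kernel density estimates on a grid (a minimal sketch; the grid spacing is an arbitrary choice of mine):

(* maximum pointwise difference between the two estimated densities on [-2, 2] *)
Max[Table[Abs[PDF[distribution1, x] - PDF[distribution2, x]], {x, -2, 2, 0.01}]]

A small value here confirms the visual agreement.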
Similar computations can be carried out for other values of $T$, for instance $T=0.34$. If you plot the approximate density function for smaller and smaller $T$, you will see that it eventually approaches a Dirac delta at $0$, which agrees with the fact that $I(0)=0$ almost surely.
Remark: Computational methods are in constant use in research to approximate probabilistic features of the response processes of random differential equations. See for example [M. A. El-Tawil, The approximate solutions of some stochastic differential equations using transformations, Applied Mathematics and Computation 164 (1), 167–178, 2005], [D. Xiu, Numerical Methods for Stochastic Computations: A Spectral Method Approach, Princeton University Press, 2010], [L. Villafuerte, B. M. Chen-Charpentier, A random differential transform method: Theory and applications, Applied Mathematics Letters 25 (10), 1490–1494, 2012].
Every step is correct, including the identities $Y_t=\displaystyle\int_0^tW_s\mathrm ds=\int_0^t(t-s)\mathrm dW_s$ and $[Y]_t=0$, except the underlying assertion that the process $Y^2-[Y]$ should be a martingale and consequently that $\mathbb E(Y_t^2)$ and $\mathbb E([Y]_t)$ should coincide for every $t$.
Recall that the quadratic variation process $[Y]$ is defined as
$[Y]_t=Y_t^2-2\displaystyle\int_0^tY_s\mathrm dY_s$. In particular, $Y^2-[Y]=2\displaystyle\int_0^\cdot Y\mathrm dY$ is a martingale when $Y$ is, but not otherwise.
Here, $Y$ is not a martingale since, for every $s\lt t$, $\mathbb E(Y_t\mid\mathcal F_s)=Y_s+(t-s)W_s\ne Y_s$.
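To check this identity, write $Y_t=Y_s+\int_s^t W_u\,\mathrm du$ and use Fubini together with the martingale property of $W$:
$$
\mathbb E(Y_t\mid\mathcal F_s)=Y_s+\int_s^t\mathbb E(W_u\mid\mathcal F_s)\,\mathrm du=Y_s+\int_s^t W_s\,\mathrm du=Y_s+(t-s)W_s.
$$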
Edit: A general method (which is not a trick) to compute $\mathbb E(Y_t^2)$ is indeed to use Itô's formula for $Y^2$. One gets
$$
\mathrm d(Y_t^2)=2Y_t\mathrm dY_t+\mathrm d[Y]_t=2Y_tW_t\mathrm dt.
$$
Fubini on $\Omega\times[0,t]$ with the measure $\mathbb P\otimes\mathrm{Leb}$ yields
$$
\mathbb E(Y_t^2)=\mathbb E\left(\int_0^t2Y_sW_s\mathrm ds\right)=\int_0^t2\mathbb E(Y_sW_s)\mathrm ds.
$$
Fubini again yields
$$
2\mathbb E(Y_sW_s)=2\int_0^s\mathbb E(W_uW_s)\mathrm du=\int_0^s2u\mathrm du=s^2,
$$
hence
$$
\mathbb E(Y_t^2)=\int_0^ts^2\mathrm ds=\frac{t^3}3.
$$
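As a cross-check, the representation $Y_t=\int_0^t(t-s)\,\mathrm dW_s$ recalled above gives the same value directly, by the Itô isometry:
$$
\mathbb E(Y_t^2)=\int_0^t(t-s)^2\,\mathrm ds=\frac{t^3}3.
$$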
This addresses the question cited as a motivation.
For every $t\geqslant0$, introduce $X_t=\int\limits_{0}^{t}\cos(B_s)\,\textrm{d}s$ and $m(t)=\mathrm E(\cos(B_t))=\mathrm E(\cos(\sqrt{t}Z))$, where $Z$ is standard normal. Then $\mathrm E(X_t)=\int\limits_{0}^{t}m(s)\,\textrm{d}s$ and $\mathrm E(X_t^2)=\int\limits_{0}^{t}\int\limits_{u}^{t}2\mathrm E(\cos(B_s)\cos(B_u))\,\textrm{d}s\textrm{d}u$.
For every $s\geqslant u\geqslant0$, one has $2\cos(B_s)\cos(B_u)=\cos(B_s+B_u)+\cos(B_s-B_u)$. Furthermore, $B_s+B_u=2B_u+(B_s-B_u)$ is normal with variance $4u+(s-u)=s+3u$ and $B_s-B_u$ is normal with variance $s-u$. Hence, $2\mathrm E(\cos(B_s)\cos(B_u))=m(s+3u)+m(s-u)$, which implies $$ \mathrm E(X_t^2)=\int\limits_{0}^{t}\int\limits_{u}^{t}(m(s+3u)+m(s-u))\,\textrm{d}s\textrm{d}u. $$ Since $m(t)=\mathrm e^{-t/2}$, this yields after some standard computations, $\mathrm E(X_t)=2(1-\mathrm e^{-t/2})$ and $$ \mathrm E(X_t^2)=2t-\frac13(1-\mathrm e^{-2t})-\frac83(1-\mathrm e^{-t/2}). $$ Sanity check: When $t\to0^+$, $\mathrm E(X_t^2)=t^2+o(t^2)$.
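For what it is worth, these closed forms can be checked against the Monte Carlo scheme from the first part of this page (a sketch; it reuses the kernel density distribution1 computed there with $T=1$, $J=40$, $M=50000$, so agreement to two or three decimals is all one should expect):

(* exact mean and second moment of X_1 from the formulas above *)
N[{2*(1 - Exp[-1/2]), 2 - (1 - Exp[-2])/3 - 8*(1 - Exp[-1/2])/3}]
(* Monte Carlo estimates from the kernel density estimate *)
{Mean[distribution1], Variance[distribution1] + Mean[distribution1]^2}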
To compute the integral $J_t=\mathrm E\left[ \cos(B_t)\int\limits_{0}^{t} \sin(B_s)\,\textrm{d}B_s \right]$, one can start with Itô's formula $$ \cos(B_t)=1-\int\limits_{0}^{t} \sin(B_s)\,\textrm{d}B_s-\frac12\int\limits_{0}^{t} \cos(B_s)\,\textrm{d}s, $$ hence $$ J_t=\mathrm E(\cos(B_t))-\mathrm E(\cos^2(B_t))-\frac12\int\limits_{0}^{t} \mathrm E(\cos(B_t)\cos(B_s))\,\textrm{d}s, $$ and it seems each term can be computed easily.
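Indeed they can. A sketch of the remaining computations (my addition, worth double-checking): $\cos^2(B_t)=\frac12(1+\cos(2B_t))$ and $2B_t$ is centered normal with variance $4t$, hence $\mathrm E(\cos^2(B_t))=\frac12(1+\mathrm e^{-2t})$; moreover, by the covariance identity above, $2\mathrm E(\cos(B_t)\cos(B_s))=m(t+3s)+m(t-s)$ for $0\leqslant s\leqslant t$. Plugging in $m(t)=\mathrm e^{-t/2}$ and integrating, $$ J_t=\mathrm e^{-t/2}-\frac12\left(1+\mathrm e^{-2t}\right)-\frac14\int_0^t\left(\mathrm e^{-(t+3s)/2}+\mathrm e^{-(t-s)/2}\right)\mathrm ds=\frac43\mathrm e^{-t/2}-\frac13\mathrm e^{-2t}-1. $$ Sanity check: when $t\to0^+$, this expands to $J_t=-\frac12t^2+o(t^2)$, consistent with $\cos(B_t)\approx1-\frac12B_t^2$ and $\int_0^t\sin(B_s)\,\mathrm dB_s\approx\frac12(B_t^2-t)$.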