First let's address the case $\Sigma = \sigma\mathbb{I}$. At the end is the (easy) generalization to arbitrary $\Sigma$.
Begin by observing that the inner product is the sum of iid variables, each the product of two independent Normal$(0,\sigma)$ variates. This reduces the question to finding the mgf of such a product, because the mgf of a sum of independent variables is the product of their mgfs.
The mgf can be found by integration, but there's an easier way. When $X$ and $Y$ are standard normal,
$$XY = ((X+Y)/2)^2 - ((X-Y)/2)^2$$
is a difference of two independent scaled Chi-squared variates. (The scale factor is $1/2$ because the variances of $(X\pm Y)/2$ equal $1/2$.) Because the mgf of a chi-squared variate with one degree of freedom is $1/\sqrt{1 - 2\omega}$, the mgf of $((X+Y)/2)^2$ is $1/\sqrt{1-\omega}$ and the mgf of $-((X-Y)/2)^2$ is $1/\sqrt{1+\omega}$. Multiplying, we find that the desired mgf equals $1/\sqrt{1-\omega^2}$.
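This closed form is easy to check by simulation. The following is a minimal Python sketch (NumPy assumed; the seed and the choice $\omega = 0.4$ are arbitrary) comparing the empirical mean of $e^{\omega XY}$ against $1/\sqrt{1-\omega^2}$:

```python
# Monte Carlo check that the mgf of XY, the product of two independent
# standard normals, equals 1 / sqrt(1 - w^2) for |w| < 1.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10**6)
y = rng.standard_normal(10**6)

w = 0.4  # any |w| < 1 works; |w| < 1/2 keeps the estimator's variance finite
empirical_mgf = np.exp(w * x * y).mean()
theoretical_mgf = 1.0 / np.sqrt(1.0 - w**2)
print(empirical_mgf, theoretical_mgf)  # the two agree closely
```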
(For later reference, notice that when $X$ and $Y$ are rescaled by $\sigma$, their product scales by $\sigma^2$, whence $\omega$ should scale by $\sigma^2$, too.)
This should look familiar: up to some constant factors and a sign, it looks like the probability density for a Student t distribution with $0$ degrees of freedom. (Indeed, if we had been working with characteristic functions instead of mgfs, we would obtain $1/\sqrt{1 + \omega^2}$, which is even closer to a Student t PDF.) Never mind that there is no such thing as a Student t with $0$ dfs--all that matters is that the mgf be analytic in a neighborhood of $0$ and this clearly is (by the Binomial Theorem).
It follows immediately that the distribution of the inner product of these iid Gaussian $n$-vectors has mgf equal to the $n$-fold product of this mgf,
$$(1 - \omega^2 \sigma^4)^{-n/2}, \quad n=1, 2, \ldots.$$
By looking up the characteristic function of the Student t distributions, we deduce (with a tiny bit of algebra or an integration to find the normalizing constant) that the PDF itself is given by
$$f_{n,\sigma}(x) = \frac{2^{\frac{1-n}{2}} |x|^{\frac{n-1}{2}} K_{\frac{n-1}{2}}\left(\frac{|x|}{\sigma ^2}\right)}{\sqrt{\pi }\, \sigma ^{n+1}\, \Gamma \left(\frac{n}{2}\right)}$$
($K$ is a Bessel function).
For instance, here is a plot of that PDF superimposed on the histogram of a random sample of $10^5$ such inner products where $\sigma=1/2$ and $n=3$:
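The same check can be sketched numerically rather than graphically (SciPy assumed; the interval $[-0.1,0.1]$, seed, and sample size are arbitrary choices): integrate the Bessel-$K$ density over a small interval and compare with the empirical frequency of simulated inner products. Note that for $n=3$ the normalization $\sigma^{n+1}$ equals $\sigma^4$.

```python
# Compare the Bessel-K density f_{n,sigma} with simulated inner products
# of two iid N(0, sigma^2 I_n) vectors, for n = 3 and sigma = 1/2.
import numpy as np
from scipy.special import kv, gamma

def f(x, n, sigma):
    """PDF of the inner product of two iid N(0, sigma^2 I_n) vectors."""
    ax = np.abs(x)
    return (2 ** ((1 - n) / 2) * ax ** ((n - 1) / 2)
            * kv((n - 1) / 2, ax / sigma**2)
            / (np.sqrt(np.pi) * sigma ** (n + 1) * gamma(n / 2)))

rng = np.random.default_rng(1)
n, sigma, N = 3, 0.5, 10**5
X = rng.normal(0.0, sigma, (N, n))
Y = rng.normal(0.0, sigma, (N, n))
dots = np.sum(X * Y, axis=1)

# Riemann sum of f over [-0.1, 0.1]; an even number of grid points avoids
# x = 0, where the formula evaluates as 0 * inf.
grid = np.linspace(-0.1, 0.1, 2000)
prob_theory = np.sum(f(grid, n, sigma)) * (grid[1] - grid[0])
prob_empirical = np.mean(np.abs(dots) <= 0.1)
print(prob_theory, prob_empirical)
```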
It's harder to confirm the accuracy of the mgf from a simulation, but note (from the Binomial Theorem) that
$$(1 - t^2 \sigma^4)^{-3/2} = 1+\frac{3 \sigma ^4 t^2}{2}+\frac{15 \sigma ^8 t^4}{8}+\frac{35 \sigma ^{12} t^6}{16}+\frac{315 \sigma ^{16} t^8}{128}+\ldots,$$
from which we may read off the moments (divided by factorials). Due to the symmetry about $0$, only the even moments matter. For $\sigma=1/2$ we obtain the following values, to be compared to the raw moments of this simulation:
k    mgf          simulation/k!
2    0.09375      0.09424920
4    0.00732422   0.00740436
6    0.00053406   0.00054128
8    0.00003755   0.00003674
10   2.58e-6      2.17e-6
As is to be expected, the higher moments of the simulation begin to depart from the moments given by the mgf; but at least up through the tenth moment there is excellent agreement.
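The moment comparison above can be reproduced along these lines (a sketch with NumPy; the seed and tolerances are arbitrary). The coefficients $3/2, 15/8, 35/16, 315/128$ of $u^j$ in $(1-u)^{-3/2}$, with $u = t^2\sigma^4$, give the raw moments divided by factorials:

```python
# Raw even moments of simulated inner products (n = 3, sigma = 1/2),
# divided by k!, against the series coefficients of (1 - t^2 sigma^4)^{-3/2}.
import numpy as np
from math import factorial

rng = np.random.default_rng(2)
sigma, n, N = 0.5, 3, 10**6
dots = np.sum(rng.normal(0, sigma, (N, n)) * rng.normal(0, sigma, (N, n)),
              axis=1)

# Coefficient of u^j in (1 - u)^{-3/2}, with u = t^2 sigma^4: the k = 2j term
# of the mgf expansion is this coefficient times sigma^(2k).
coefficients = {2: 3 / 2, 4: 15 / 8, 6: 35 / 16, 8: 315 / 128}
theory = {k: c * sigma ** (2 * k) for k, c in coefficients.items()}
simulated = {k: np.mean(dots**k) / factorial(k) for k in coefficients}
for k in coefficients:
    print(k, theory[k], simulated[k])
```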
Incidentally, when $n=2$ the distribution is bi-exponential (Laplace).
To handle the general case, begin by noting that the inner product is a coordinate-independent object. We may therefore take the principal directions (eigenvectors) of $\Sigma$ as coordinates. In these coordinates the inner product is the sum of independent products of independent Normal variates, each component distributed with a variance equal to its associated eigenvalue. Thus, letting the nonzero eigenvalues be $\sigma_1^2, \sigma_2^2, \ldots, \sigma_d^2$ (with $0 \le d \le n$), the mgf must equal
$$\left(\prod_{i=1}^d (1 - \omega^2\sigma_i^4)\right)^{-1/2}.$$
To confirm that I made no error in this reasoning, I worked out an example where $\Sigma$ is the matrix
$$\left(
\begin{array}{ccc}
1 & \frac{1}{2} & -\frac{1}{8} \\
\frac{1}{2} & 1 & -\frac{1}{4} \\
-\frac{1}{8} & -\frac{1}{4} & \frac{1}{2}
\end{array}
\right)$$
and computed that its eigenvalues are
$$\left(\sigma_1^2, \sigma_2^2, \sigma_3^2\right) = \left(\frac{1}{16} \left(17+\sqrt{65}\right),\frac{1}{16} \left(17-\sqrt{65}\right),\frac{3}{8}\right)\approx \left(1.56639,0.558609,0.375\right).$$
It was possible to compute the PDF by numerically evaluating the Fourier Transform of the characteristic function (as derived from the mgf formula given here): a plot of this PDF is shown in the following figure as a red line. At the same time, I generated $10^6$ iid variates $X_i$ from the Normal$(0,\Sigma)$ distribution and another $10^6$ iid variates $Y_i$ in the same way, and computed the $10^6$ dot products $X_i\cdot Y_i$. The plot shows the histogram of these dot products (omitting some of the most extreme values--the range was from $-12$ to $15$):
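The second moment implied by this mgf, $\sum_i \sigma_i^4$ (read off from the $\omega^2$ coefficient), offers a quick consistency check by simulation; the following Python sketch (NumPy assumed; the seed is arbitrary) computes the $k=2$ entry of the table below both ways:

```python
# Check E[(X.Y)^2] = sum_i sigma_i^4 for X, Y iid N(0, Sigma) with the
# example covariance matrix from the text.
import numpy as np

Sigma = np.array([[1.0, 0.5, -0.125],
                  [0.5, 1.0, -0.25],
                  [-0.125, -0.25, 0.5]])
eigvals = np.linalg.eigvalsh(Sigma)   # the sigma_i^2
theory = np.sum(eigvals**2) / 2       # (second moment) / 2!

rng = np.random.default_rng(3)
N = 10**6
L = np.linalg.cholesky(Sigma)         # Sigma is positive definite
X = rng.standard_normal((N, 3)) @ L.T
Y = rng.standard_normal((N, 3)) @ L.T
simulated = np.mean(np.sum(X * Y, axis=1) ** 2) / 2
print(theory, simulated)
```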
As before, the agreement is excellent. Furthermore, the moments match well through the eighth and reasonably well even at the tenth:
k    mgf       simulation/k!
2    1.45313   1.45208
4    2.59009   2.59605
6    5.20824   5.29333
8    11.0994   11.3115
10   24.4166   22.9982
Addendum
(Added 9 August 2013.)
$f_{n,\sigma}$ is an instance of the variance-gamma distribution, which originally was defined as "the normal variance-mean mixture where the mixing density is the gamma distribution." It has a standard location ($0$), asymmetry parameter $0$ (it is symmetric), scale parameter $\sigma^2$, and shape parameter $n/2$ (according to the Wikipedia parameterization).
Because (as is well-known) a uniform distribution on the unit sphere $S^{D-1}$ is obtained by normalizing a $D$-variate normal distribution and the dot product $t$ of normalized vectors is their correlation coefficient, the answers to the three questions are:
$u= (t+1)/2$ has a Beta$((D-1)/2,(D-1)/2)$ distribution.
The variance of $t$ equals $1/D$ (as speculated in the question).
The standardized distribution of $t$ approaches normality at a rate of $O\left(\frac{1}{D}\right).$
Method
The exact distribution of the dot product of unit vectors is easily obtained geometrically, because this is the component of the second vector in the direction of the first. Since the second vector is independent of the first and is uniformly distributed on the unit sphere, its component in the first direction is distributed the same as any coordinate of the sphere. (Notice that the distribution of the first vector does not matter.)
Finding the Density
Letting that coordinate be the last, the density at $t \in [-1,1]$ is therefore proportional to the surface area lying at a height between $t$ and $t+dt$ on the unit sphere. That proportion occurs within a belt of height $dt$ and radius $\sqrt{1-t^2},$ which is essentially a conical frustum constructed out of an $S^{D-2}$ of radius $\sqrt{1-t^2},$ of height $dt$, and slope $1/\sqrt{1-t^2}$. Whence the probability is proportional to
$$\frac{\left(\sqrt{1 - t^2}\right)^{D-2}}{\sqrt{1 - t^2}}\,dt = (1 - t^2)^{(D-3)/2} dt.$$
Letting $u=(t+1)/2 \in [0,1]$ entails $t = 2u-1$. Substituting that into the preceding gives the probability element up to a normalizing constant:
$$f_D(u)du \; \propto \; (1 - (2u-1)^2)^{(D-3)/2} d(2u-1) = 2^{D-2}(u-u^2)^{(D-3)/2}du.$$
It is immediate that $u=(t+1)/2$ has a Beta$((D-1)/2, (D-1)/2)$ distribution, because (by definition) its density also is proportional to
$$u^{(D-1)/2-1}\left(1-u\right)^{(D-1)/2-1} = (u-u^2)^{(D-3)/2} \; \propto \; f_D(u).$$
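This Beta claim, along with the variance $1/D$ stated earlier, can be checked by simulation. A Python sketch (SciPy assumed; $D=5$, the seed, and the Kolmogorov-Smirnov threshold are arbitrary choices):

```python
# Check that u = (t+1)/2 ~ Beta((D-1)/2, (D-1)/2), where t is the dot
# product of two independent uniform unit vectors in R^D (obtained by
# normalizing standard normal vectors), and that Var(t) = 1/D.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
D, N = 5, 10**5
X = rng.standard_normal((N, D))
Y = rng.standard_normal((N, D))
t = (np.sum(X * Y, axis=1)
     / (np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1)))
u = (t + 1) / 2

# Kolmogorov-Smirnov distance between the sample and the Beta CDF.
ks = stats.kstest(u, stats.beta((D - 1) / 2, (D - 1) / 2).cdf)
print(ks.statistic, np.var(t), 1 / D)
```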
Determining the Limiting Behavior
Information about the limiting behavior follows easily from this using elementary techniques: $f_D$ can be integrated to obtain the constant of proportionality $\frac{\Gamma \left(\frac{D}{2}\right)}{\sqrt{\pi } \Gamma \left(\frac{D-1}{2}\right)}$; $t^k f_D(t)$ can be integrated (using properties of Beta functions, for instance) to obtain moments, showing that the variance is $1/D$ and shrinks to $0$ (whence, by Chebyshev's Theorem, the probability is becoming concentrated near $t=0$); and the limiting distribution is then found by considering values of the density of the standardized distribution, proportional to $f_D(t/\sqrt{D}),$ for small values of $t$:
$$\eqalign{
\log(f_D(t/\sqrt{D})) &= C(D) + \frac{D-3}{2}\log\left(1 - \frac{t^2}{D}\right) \\
&=C(D) -\left(\frac{1}{2} - \frac{3}{2D}\right)t^2 + O\left(\frac{t^4}{D}\right) \\
&\to C -\frac{1}{2}t^2
}$$
where the $C$'s represent (log) constants of integration. Evidently the rate at which this approaches normality (for which the log density equals $-\frac{1}{2}t^2$) is $O\left(\frac{1}{D}\right).$
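The $O\left(\frac{1}{D}\right)$ rate can be verified directly: the $t$-dependent deviation of the standardized log density from its normal limit should halve each time $D$ doubles. A small Python sketch (the choice $t=1/2$ and the doubling sequence are arbitrary):

```python
# Deviation of log f_D(t / sqrt(D)) (up to the constant C(D)) from the
# normal limit -t^2/2; its leading term is proportional to 1/D, so doubling
# D should roughly halve it.
import numpy as np

def deviation(t, D):
    return (D - 3) / 2 * np.log1p(-t**2 / D) + t**2 / 2

t = 0.5
devs = [abs(deviation(t, D)) for D in (10, 20, 40, 80)]
ratios = [devs[i] / devs[i + 1] for i in range(3)]
print(ratios)  # each ratio is close to 2
```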
This plot shows the densities of the dot product for $D=4, 6, 10$, as standardized to unit variance, and their limiting density. The values at $0$ increase with $D$ (from blue through red, gold, and then green for the standard normal density). The density for $D=1000$ would be indistinguishable from the normal density at this resolution.
Best Answer
Hey Yaroslav, you really do not have to hurry to accept my answer on MO, and you are more than welcome to ask for further details :).
Since you reformulated the question in three dimensions, I can see exactly what you want to do. In the MO post I thought you only needed to calculate the largest cosine between two random vectors. Now the problem seems tougher.
First, we derive the distribution of the normalized Gaussian $\frac{X}{\|X\|}$. This is not a trivial job; the result even has a name, the "projected normal distribution," because we can rewrite the multivariate normal density of $X$ in terms of polar coordinates $(\|X\|,\frac{X}{\|X\|})=(r,\boldsymbol{\theta})$. The marginal density for $\boldsymbol{\theta}$ is then obtained as $$\int_{\mathbb{R}^{+}}f(r,\boldsymbol{\theta})\,dr.$$
In this step we obtain the distributions $\mathcal{PN}_{k}$ of $\frac{X}{\|X\|}$ and $\frac{Y}{\|Y\|}$, and hence the joint density of $(\frac{X}{\|X\|},\frac{Y}{\|Y\|})$ by independence. For a concrete density function of the projected normal distribution, see [Mardia&Peter] Chap. 10, [2] Equation (4), or [1]. (Notice that [2] also assumes a special form of the covariance matrix $\Sigma=\left(\begin{array}{cc} \Gamma & \gamma\\ \gamma' & 1 \end{array}\right)$.)
Second, since we have their joint density, the density of the inner product can be derived using the change-of-variables formula for the mapping $$(\frac{X}{\|X\|},\frac{Y}{\|Y\|})\mapsto\frac{X}{\|X\|}\cdot\frac{Y}{\|Y\|}.$$ Also see [3].
Once we have the density, the second moment is just a matter of integration.
Reference
[Mardia&Peter] Mardia, Kanti V., and Peter E. Jupp. Directional Statistics. Vol. 494. John Wiley & Sons, 2009.
[1] Wang, Fangpo, and Alan E. Gelfand. "Directional data analysis under the general projected normal distribution." Statistical Methodology 10.1 (2013): 113-127.
[2] Hernandez-Stumpfhauser, Daniel, F. Jay Breidt, and Mark J. van der Woerd. "The general projected normal distribution of arbitrary dimension: modeling and Bayesian inference." Bayesian Analysis (2016). https://projecteuclid.org/download/pdfview_1/euclid.ba/1453211962
[3] Moment generating function of the inner product of two Gaussian random vectors.