Solved – How to calculate confidence interval for a geometric mean

confidence interval, distributions, geometric mean

Apologies if this is confusing; I'm very unfamiliar with geometric means. For context, my data set is 35 month-end portfolio values. I computed the month-to-month growth rate, [Month(N)/Month(N-1)] – 1, so I now have 34 observations, and I would like to estimate a month-end value from the known previous month-end value. For example, if I know the ending value of the portfolio last month, I would multiply it by a growth rate to get an estimate of this month's ending value, +/- the margin of error.

I initially used the arithmetic mean of the growth rates, found the sample standard deviation and calculated a confidence interval to get my lower / upper bound growth rates.

I'm now doubting the accuracy of this method and have tried to use the geometric mean instead. I currently have my set of 34 growth rates without subtracting 1 (so that all values are positive), I calculated their geometric mean, and to calculate the standard deviation I used this Wikipedia formula:
$$
\sigma_g = \exp\!\left(\sqrt{\frac{\sum_{i=1}^n \left(\ln\frac{x_i}{\mu_g}\right)^2}{n}} \right)
$$
I'm now at a loss as to how to calculate a 95% CI. I've looked through similar questions on this site and searched the internet more generally, and I am seeing different opinions on methods and formulas (I admittedly am also getting a bit lost in the underlying math).

Currently I'm using the formulas for a normal distribution to calculate a confidence interval based off the geometric standard deviation minus 1 (to get it back to a percentage), such that:

Standard Error = [(Geometric Stdev-1)/Sqrt(N)],
Margin of Error = [Standard Error * 1.96], and
CI = [Geometric Mean +/- Margin of Error]

Is this a reasonable approximation or should I be using a different method to calculate the CI?

Best Answer

You can compute the arithmetic mean of the log growth rate:

Let $V_t$ be the value of your portfolio at time $t$
Let $R_t = \frac{V_t}{V_{t-1}}$ be the growth rate of your portfolio from $t-1$ to $t$

The basic idea is to take logs and do your standard stuff. Taking logs transforms multiplication into a sum.

Let $r_t = \log R_t$ be the log growth rate.

$$\bar{r} = \frac{1}{T} \sum_{t=1}^T r_t \quad \quad s_r = \sqrt{\frac{1}{T-1} \sum_{t=1}^T \left( r_t - \bar{r}\right)^2}$$

Then your standard error $\mathit{SE}_{\bar{r}}$ for your sample mean $\bar{r}$ is given by:

$$ \mathit{SE}_{\bar{r}} = \frac{s_r}{\sqrt{T}}$$
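As a rough illustration of the quantities defined so far, here is a minimal Python sketch. The `values` list is made-up data (not the asker's portfolio), and the helper names `log_growth_rates` and `mean_and_standard_error` are hypothetical, chosen only for this example.

```python
import math
import statistics

# Hypothetical month-end portfolio values (illustrative only).
values = [100.0, 102.0, 101.0, 104.0, 103.5, 106.0, 108.0]


def log_growth_rates(values):
    """Return r_t = log(V_t / V_{t-1}) for consecutive month-end values."""
    return [math.log(curr / prev) for prev, curr in zip(values, values[1:])]


def mean_and_standard_error(log_rates):
    """Return (r_bar, s_r, SE), where s_r uses the T - 1 denominator."""
    T = len(log_rates)
    r_bar = statistics.fmean(log_rates)
    s_r = statistics.stdev(log_rates)  # sample standard deviation
    return r_bar, s_r, s_r / math.sqrt(T)


log_rates = log_growth_rates(values)
r_bar, s_r, se = mean_and_standard_error(log_rates)
print(f"T = {len(log_rates)}, r_bar = {r_bar:.5f}, s_r = {s_r:.5f}, SE = {se:.5f}")
```

Note that 7 month-end values give only 6 growth rates, just as 35 values give 34.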

The 95 percent confidence interval for $\mu_r = \operatorname{E}[r_t]$ would be approximately: $$\left( \bar{r} - 2 \mathit{SE}_{\bar{r}} , \bar{r} + 2 \mathit{SE}_{\bar{r}} \right).$$

Exponentiate to get confidence interval for $e^{\mu_r}$

Since $e^x$ is a strictly increasing function, a 95 percent confidence interval for $e^{\mu_r}$ would be:

$$\left( e^{\bar{r} - 2 \mathit{SE}_{\bar{r}}} , e^{\bar{r} + 2 \mathit{SE}_{\bar{r}}} \right)$$
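The whole recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gross growth rates are made up, the function name `geometric_mean_ci` is hypothetical, and the multiplier `z=2.0` is the same rough 95 percent approximation used in the formulas above.

```python
import math
import statistics


def geometric_mean_ci(gross_rates, z=2.0):
    """Approximate CI for exp(mu_r) from gross growth rates R_t = V_t / V_{t-1}.

    Returns (lower, geometric_mean, upper). z=2.0 is a rough 95% multiplier.
    """
    log_rates = [math.log(R) for R in gross_rates]
    T = len(log_rates)
    r_bar = statistics.fmean(log_rates)
    se = statistics.stdev(log_rates) / math.sqrt(T)  # s_r / sqrt(T)
    # Build the interval on the log scale, then exponentiate the endpoints.
    return math.exp(r_bar - z * se), math.exp(r_bar), math.exp(r_bar + z * se)


# Hypothetical gross monthly growth rates (made up for illustration).
gross_rates = [1.02, 0.99, 1.03, 1.01, 0.98, 1.04, 1.00, 1.02]
lower, gm, upper = geometric_mean_ci(gross_rates)
print(f"GM = {gm:.4f}, approx. 95% CI = ({lower:.4f}, {upper:.4f})")
```

Because the interval is built on the log scale, it is symmetric around the geometric mean in ratio terms (`gm / lower` equals `upper / gm`) rather than in additive terms.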

And we're done. Why are we done?

Observe $\bar{r} = \frac{1}{T} \sum_t r_t$ is the log of the geometric mean

Hence $e^{\bar{r}}$ is geometric mean of your sample. To show this, observe the geometric mean is given by:

$$ \mathit{GM} = \left(R_1R_2\ldots R_T\right)^\frac{1}{T}$$

Hence if we take the log of both sides:

\begin{align*} \log \mathit{GM} &= \frac{1}{T} \sum_{t=1}^T \log R_t \\ &= \bar{r} \end{align*}
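A quick numerical check of this identity, using a handful of made-up gross growth rates (the numbers are hypothetical):

```python
import math

# Hypothetical gross growth rates R_t (made up for illustration).
gross_rates = [1.02, 0.99, 1.03, 1.01]
T = len(gross_rates)

# Geometric mean computed directly as the T-th root of the product.
gm_product = math.prod(gross_rates) ** (1 / T)

# Geometric mean computed as exp of the arithmetic mean of the logs.
gm_logs = math.exp(sum(math.log(R) for R in gross_rates) / T)

print(gm_product, gm_logs, math.isclose(gm_product, gm_logs))
```

The two values should agree up to floating-point rounding.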

Some examples to build intuition:

Let's say the mean log growth rate you compute is $.02$. Then the geometric mean is $\exp(.02) \approx 1.0202$.
Let's say the mean log growth rate you compute is $-.05$. Then the geometric mean is $\exp(-.05) \approx .9512$.

For $x \approx 1$, we have $\log(x) \approx x - 1$ and for $y \approx 0$, we have $\exp(y) \approx y + 1$. Further away though, those tricks break down:

Let's say the mean log growth rate you compute is $.69$. Then the geometric mean is $\exp(.69) \approx 2$ (i.e. the value doubles every period).

If all your log growth rates $r_t$ are near zero (or equivalently $\frac{V_t}{V_{t-1}}$ is near 1), then you'll find that the geometric mean and the arithmetic mean will be quite close.
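To see how close the two means are in practice, the following sketch compares them on two made-up sets of gross growth rates: one near 1 and one far from 1. The data and the helper name `compare_means` are hypothetical.

```python
import statistics


def compare_means(gross_rates):
    """Return (arithmetic mean, geometric mean) of gross growth rates."""
    am = statistics.fmean(gross_rates)
    gm = statistics.geometric_mean(gross_rates)
    return am, gm


calm = [1.01, 0.99, 1.02, 1.00]      # hypothetical rates close to 1
volatile = [1.50, 0.60, 1.40, 0.70]  # hypothetical rates far from 1

am_calm, gm_calm = compare_means(calm)
am_vol, gm_vol = compare_means(volatile)

print(f"calm:     AM = {am_calm:.5f}, GM = {gm_calm:.5f}, gap = {am_calm - gm_calm:.5f}")
print(f"volatile: AM = {am_vol:.5f}, GM = {gm_vol:.5f}, gap = {am_vol - gm_vol:.5f}")
```

The gap between the two means is tiny for the calm series and noticeable for the volatile one. In the volatile case the arithmetic mean is above 1 while the geometric mean is below 1, so the two can even disagree on whether the portfolio grew.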

Another answer that might be useful:

As this answer discusses, log differences are basically percent changes.

Comment: it's useful in finance to get comfortable thinking in logs. It's similar to thinking in terms of percent changes but mathematically cleaner.
