Solved – How to calculate confidence interval for a geometric mean

confidence interval, distributions, geometric mean

Apologies if this is confusing at all, I'm very unfamiliar with geometric means. For context, my data set is 35 month-end portfolio values. I found the month to month growth rate [Month(N)/Month(N-1)] – 1, such that I now have 34 observations and would like to estimate a month end value using the known previous month end's value. For example if I know what the ending value of the portfolio was last month, I would take that multiplied by a growth rate to get an estimate of this month's ending value +/- the margin of error.

I initially used the arithmetic mean of the growth rates, found the sample standard deviation and calculated a confidence interval to get my lower / upper bound growth rates.

I'm now doubting the accuracy of this method and have tried to use the geometric mean instead. I currently have my set of 34 growth rates, except that I did not subtract 1, so that all values are positive. I calculated the geometric mean, and to calculate the standard deviation I used this Wikipedia formula:
$$
\sigma_g = \exp\!\left(\sqrt{\frac{\sum_{i=1}^n\left(\ln\frac{x_i}{\mu_g}\right)^2}{n}} \right)
$$
I'm now at a loss as to how to calculate a 95% CI as I've looked through similar questions on this site as well as general searching the internet and am seeing different opinions on methods and formulas (I admittedly am also getting a bit lost in the underlying math).

Currently I'm using the formulas for a normal distribution to calculate a confidence interval based off the geometric standard deviation minus 1 (to get it back to a percentage), such that:

  • Standard Error = [(Geometric Stdev-1)/Sqrt(N)],
  • Margin of Error = [Standard Error * 1.96], and
  • CI = [Geometric Mean +/- Margin of Error]

Is this a reasonable approximation or should I be using a different method to calculate the CI?

Best Answer

You can compute the arithmetic mean of the log growth rate:

  • Let $V_t$ be the value of your portfolio at time $t$
  • Let $R_t = \frac{V_t}{V_{t-1}}$ be the growth rate of your portfolio from $t-1$ to $t$

The basic idea is to take logs and do your standard stuff. Taking logs transforms multiplication into a sum.

  • Let $r_t = \log R_t$ be the log growth rate.

$$\bar{r} = \frac{1}{T} \sum_{t=1}^T r_t \quad \quad s_r = \sqrt{\frac{1}{T-1} \sum_{t=1}^T \left( r_t - \bar{r}\right)^2}$$

Then your standard error $\mathit{SE}_{\bar{r}}$ for your sample mean $\bar{r}$ is given by:

$$ \mathit{SE}_{\bar{r}} = \frac{s_r}{\sqrt{T}}$$

The 95 percent confidence interval for $\mu_r = {\operatorname{E}[r_t]}$ would be approximately (rounding the normal quantile 1.96 up to 2): $$\left( \bar{r} - 2 \mathit{SE}_{\bar{r}} , \bar{r} + 2 \mathit{SE}_{\bar{r}} \right)$$

Exponentiate to get confidence interval for $e^{\mu_r}$

Since $e^x$ is a strictly increasing function, a 95 percent confidence interval for $e^{\mu_r}$ would be:

$$\left( e^{\bar{r} - 2 \mathit{SE}_{\bar{r}}} , e^{\bar{r} + 2 \mathit{SE}_{\bar{r}}} \right)$$
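The steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `geometric_mean_ci`, the multiplier argument `z`, and the sample values are all made up for the example, and the multiplier 2 is the same rough stand-in for a 95 percent interval used above.

```python
import math
import statistics


def geometric_mean_ci(values, z=2.0):
    """Return (lower, gm, upper) for the geometric mean growth factor.

    values: consecutive period-end portfolio values, all positive.
    z: multiplier on the standard error (2.0 mirrors the rough rule above).
    """
    # Log growth rates r_t = log(V_t / V_{t-1})
    log_rates = [math.log(b / a) for a, b in zip(values, values[1:])]
    T = len(log_rates)
    r_bar = statistics.fmean(log_rates)   # mean log growth rate
    s_r = statistics.stdev(log_rates)     # sample std dev (divides by T - 1)
    se = s_r / math.sqrt(T)               # standard error of r_bar
    # Exponentiate the interval endpoints to get back to growth factors
    return math.exp(r_bar - z * se), math.exp(r_bar), math.exp(r_bar + z * se)


# Hypothetical month-end values, for illustration only
values = [100.0, 102.0, 101.0, 104.5, 106.0, 105.2, 108.9]
lower, gm, upper = geometric_mean_ci(values)
print(f"GM = {gm:.4f}, approx. 95% CI = ({lower:.4f}, {upper:.4f})")
```

Note that the interval is built on the log scale and only then exponentiated, so it is not symmetric around the geometric mean. To express the bounds as percentage growth rates, subtract 1 from each.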

And we're done. Why are we done?

Observe $\bar{r} = \frac{1}{T} \sum_t r_t$ is the log of the geometric mean

Hence $e^{\bar{r}}$ is geometric mean of your sample. To show this, observe the geometric mean is given by:

$$ \mathit{GM} = \left(R_1R_2\ldots R_T\right)^\frac{1}{T}$$

Hence if we take the log of both sides:

\begin{align*} \log \mathit{GM} &= \frac{1}{T} \sum_{t=1}^T \log R_t \\ &= \bar{r} \end{align*}
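The identity above can be checked numerically. The short sketch below uses made-up growth factors (the list `R` is purely illustrative) and compares the product definition, the exponentiated mean of logs, and Python's built-in `statistics.geometric_mean`.

```python
import math
import statistics

# Hypothetical growth factors R_t = V_t / V_{t-1}
R = [1.02, 0.97, 1.05, 1.01]

# Geometric mean from the product definition
gm_product = math.prod(R) ** (1 / len(R))

# Geometric mean as exp of the arithmetic mean of the logs
gm_logs = math.exp(statistics.fmean([math.log(x) for x in R]))

# Standard-library implementation, for comparison
gm_stdlib = statistics.geometric_mean(R)

print(gm_product, gm_logs, gm_stdlib)
```

All three numbers should agree up to floating-point rounding.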

Some examples to build intuition:

  • Say the mean log growth rate is $.02$. Then the geometric mean is $\exp(.02) \approx 1.0202$.
  • Say the mean log growth rate is $-.05$. Then the geometric mean is $\exp(-.05) \approx .9512$.

For $x \approx 1$, we have $\log(x) \approx x - 1$, and for $y \approx 0$, we have $\exp(y) \approx y + 1$. Further away, though, those tricks break down:

  • Say the mean log growth rate is $.69$. Then the geometric mean is $\exp(.69) \approx 2$ (i.e. the value doubles every period).
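To see how quickly the approximation $\exp(y) \approx y + 1$ degrades, here is a small sketch that tabulates the gap for the three log growth rates used in the examples. The dictionary name `gaps` is just for illustration.

```python
import math

# Gap between exp(r) and the linear approximation 1 + r
gaps = {r: math.exp(r) - (1 + r) for r in (0.02, -0.05, 0.69)}

for r, gap in gaps.items():
    print(f"r = {r:+.2f}: exp(r) = {math.exp(r):.4f}, 1 + r = {1 + r:.4f}, gap = {gap:.4f}")
```

The gap is negligible for the first two rates and large for the third, which is why log growth rates and percent changes can be treated as interchangeable only when growth is small.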

If all your log growth rates $r_t$ are near zero (or equivalently $\frac{V_t}{V_{t-1}}$ is near 1), then you'll find that the geometric mean and the arithmetic mean of the growth rates are quite close.
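A quick way to convince yourself is to compare the two means on growth factors that stay near 1 and on ones that swing widely. Both lists below are invented for illustration, and `am_gm_gap` is a hypothetical helper name.

```python
import statistics


def am_gm_gap(growth_factors):
    """Arithmetic mean minus geometric mean of a list of positive growth factors."""
    return statistics.fmean(growth_factors) - statistics.geometric_mean(growth_factors)


# Hypothetical growth factors: one calm series, one volatile series
calm = [1.01, 0.99, 1.02, 0.98, 1.015]
wild = [1.60, 0.55, 1.90, 0.70, 1.25]

print(f"calm gap: {am_gm_gap(calm):.5f}")
print(f"wild gap: {am_gm_gap(wild):.5f}")
```

The arithmetic mean is never below the geometric mean, and the difference grows with the variability of the growth factors.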

Another answer that might be useful:

As this answer discusses, log differences are basically percent changes.

Comment: it's useful in finance to get comfortable thinking in logs. It's similar to thinking in terms of percent changes but mathematically cleaner.
