Solved – Can the standard deviation be calculated for harmonic mean

harmonic meanstandard deviation

Can the standard deviation be calculated for the harmonic mean? I understand that the standard deviation can be calculated for arithmetic mean, but if you have harmonic mean, how do you calculate the standard deviation or CV?

Best Answer

The harmonic mean $H$ of random variables $X_1,...,X_n$ is defined as

$$H=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}}$$

Taking moments of fractions is a messy business, so instead I would prefer working with the $1/H$. Now

$$\frac{1}{H}=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}$$.

Usin central limit theorem we immediately get that

$$\sqrt{n}\left(H^{-1}-EX_1^{-1}\right)\to N(0,VarX_1^{-1})$$

if of course $VarX_1^{-1}<\infty$ and $X_i$ are iid, since we simple work with arithmetic mean of variables $Y_i=X_i^{-1}$.

Now using delta method for function $g(x)=x^{-1}$ we get that

$$\sqrt{n}(H-(EX_1^{-1})^{-1})\to N\left(0, \frac{VarX_1^{-1}}{(EX_1^{-1})^4}\right)$$

This result is asymptotic, but for simple applications it might suffice.

Update As @whuber rightfully points out, simple applications is a misnomer. The central limit theorem holds only if $VarX_1^{-1}$ exists, which is quite a restrictive assumption.

Update 2 If you have a sample, then to calculate the standard deviation, simply plug sample moments into the formula. So for sample $X_1,...,X_n$, the estimate of harmonic mean is

\begin{align} \hat{H}=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}} \end{align}

the sample moments $EX_1^{-1}$ and $Var(X_1^{-1})$ respectively are:

\begin{align} \hat{\mu}_{R}&=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}\\\\ \hat{\sigma}_{R}^2&=\frac{1}{n}\sum_{i=1}^n\left(\frac{1}{X_i}-\mu_R\right)^2 \end{align}

here $R$ stands for reciprocal.

Finally the approximate formula for standard deviation of $\hat{H}$ is

\begin{align*} sd(\hat{H})=\sqrt{\frac{\hat{\sigma}_R^2}{n\hat{\mu}_R^4}} \end{align*}

I ran some Monte-Carlo simulations for random variables uniformly distributed in interval $[2,3]$. Here is the code:

hm <- function(x)1/mean(1/x)
sdhm <- function(x)sqrt((mean(1/x))^(-4)*var(1/x)/length(x))

n<-1000

nn <- c(10,30,50,100,500,1000,5000,10000)

N<-1000

mc<-foreach(n=nn,.combine=rbind) %do% {

    rr <- matrix(runif(n*N,min=2,max=3),nrow=N)

    c(n,mean(apply(rr,1,sdhm)),sd(apply(rr,1,sdhm)),sd(apply(rr,1,hm)))

}
colnames(mc) <- c("n","DeltaSD","sdDeltaSD","trueSD")

> mc
             n     DeltaSD    sdDeltaSD      trueSD
result.1    10 0.089879211 1.528423e-02 0.091677622
result.2    30 0.052870477 4.629262e-03 0.051738941
result.3    50 0.040915607 2.705137e-03 0.040257673
result.4   100 0.029017031 1.407511e-03 0.028284458
result.5   500 0.012959582 2.750145e-04 0.013200580
result.6  1000 0.009139193 1.357630e-04 0.009115592
result.7  5000 0.004094048 2.685633e-05 0.004070593
result.8 10000 0.002894254 1.339128e-05 0.002964259

I simulated N samples of n sized sample. For each n sized sample I calculated estimate of standard estimation (function sdhm). Then I compare the mean and standard deviation of these estimates with the sample standard deviation of harmonic mean estimated for each sample, which supposably should be the true standard deviation of harmonic mean.

As you can see the results are quite good even for moderate sample sizes. Of course uniform distribution is a very well behaved one, so it is not surprising that results are good. I'll leave for someone else to investigate the behaviour for other distributions, the code is very easy to adapt.

Note: In previous version of this answer there was an error in the result of delta method, incorrect variance.