How does the harmonic mean handle zero values? What would the harmonic mean of $\{3, 4, 5, 0\}$ be, since $1/0=\infty$?
Solved – Harmonic mean with zero value
Related Solutions
The harmonic mean $H$ of random variables $X_1,\dots,X_n$ is defined as
$$H=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}}$$
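As a quick numeric check of the definition, for the positive values in the question's sample,
$$H(3,4,5)=\frac{3}{\frac{1}{3}+\frac{1}{4}+\frac{1}{5}}=\frac{3}{\frac{47}{60}}=\frac{180}{47}\approx 3.83.$$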
Taking moments of fractions is a messy business, so I would rather work with $1/H$. Now
$$\frac{1}{H}=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}.$$
Using the central limit theorem we immediately get that
$$\sqrt{n}\left(H^{-1}-EX_1^{-1}\right)\to N(0,VarX_1^{-1})$$
provided, of course, that $VarX_1^{-1}<\infty$ and the $X_i$ are iid, since we simply work with the arithmetic mean of the variables $Y_i=X_i^{-1}$.
Now using the delta method for the function $g(x)=x^{-1}$ we get that
$$\sqrt{n}(H-(EX_1^{-1})^{-1})\to N\left(0, \frac{VarX_1^{-1}}{(EX_1^{-1})^4}\right)$$
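To spell out the delta method step: with $Y_i=X_i^{-1}$, $g(x)=x^{-1}$ and $g'(x)=-x^{-2}$, the asymptotic variance is
$$\left(g'(EY_1)\right)^2 VarY_1=\left(-\frac{1}{(EX_1^{-1})^2}\right)^2 VarX_1^{-1}=\frac{VarX_1^{-1}}{(EX_1^{-1})^4},$$
which is the variance appearing above.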
This result is asymptotic, but for simple applications it might suffice.
Update As @whuber rightfully points out, "simple applications" is a misnomer. The central limit theorem holds only if $VarX_1^{-1}$ exists, which is quite a restrictive assumption.
Update 2 If you have a sample, then to calculate the standard deviation, simply plug the sample moments into the formula. So for a sample $X_1,...,X_n$, the estimate of the harmonic mean is
\begin{align} \hat{H}=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}} \end{align}
and the sample analogues of the moments $EX_1^{-1}$ and $Var(X_1^{-1})$ are, respectively,
\begin{align} \hat{\mu}_{R}&=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}\\ \hat{\sigma}_{R}^2&=\frac{1}{n}\sum_{i=1}^n\left(\frac{1}{X_i}-\hat{\mu}_R\right)^2 \end{align}
where $R$ stands for reciprocal.
Finally, the approximate formula for the standard deviation of $\hat{H}$ is
\begin{align*} sd(\hat{H})=\sqrt{\frac{\hat{\sigma}_R^2}{n\hat{\mu}_R^4}} \end{align*}
I ran some Monte Carlo simulations for random variables uniformly distributed on the interval $[2,3]$. Here is the code:
library(foreach)

# harmonic mean
hm <- function(x) 1/mean(1/x)
# delta-method estimate of the standard deviation of the harmonic mean
sdhm <- function(x) sqrt((mean(1/x))^(-4)*var(1/x)/length(x))

nn <- c(10, 30, 50, 100, 500, 1000, 5000, 10000)  # sample sizes
N <- 1000                                          # number of Monte Carlo samples

mc <- foreach(n = nn, .combine = rbind) %do% {
    # N samples of size n, uniform on [2, 3], one sample per row
    rr <- matrix(runif(n*N, min = 2, max = 3), nrow = N)
    c(n, mean(apply(rr, 1, sdhm)), sd(apply(rr, 1, sdhm)), sd(apply(rr, 1, hm)))
}
colnames(mc) <- c("n", "DeltaSD", "sdDeltaSD", "trueSD")
colnames(mc) <- c("n","DeltaSD","sdDeltaSD","trueSD")
> mc
n DeltaSD sdDeltaSD trueSD
result.1 10 0.089879211 1.528423e-02 0.091677622
result.2 30 0.052870477 4.629262e-03 0.051738941
result.3 50 0.040915607 2.705137e-03 0.040257673
result.4 100 0.029017031 1.407511e-03 0.028284458
result.5 500 0.012959582 2.750145e-04 0.013200580
result.6 1000 0.009139193 1.357630e-04 0.009115592
result.7 5000 0.004094048 2.685633e-05 0.004070593
result.8 10000 0.002894254 1.339128e-05 0.002964259
For each sample size n I simulated N samples. For each sample I calculated the estimate of the standard deviation (the function sdhm). I then compared the mean and standard deviation of these estimates with the standard deviation of the harmonic mean estimates across the N samples, which should be close to the true standard deviation of the harmonic mean.
As you can see, the results are quite good even for moderate sample sizes. Of course the uniform distribution is a very well behaved one, so it is not surprising that the results are good. I'll leave it to someone else to investigate the behaviour for other distributions; the code is very easy to adapt.
Note: A previous version of this answer contained an error in the result of the delta method (an incorrect variance).
The F-measure is often used in the natural language processing field as an evaluation metric. In particular, the F-measure was employed by the Message Understanding Conference (MUC) to evaluate named entity recognition (NER) tasks. Quoted directly from A survey of named entity recognition and classification by D. Nadeau:
The harmonic mean of two numbers is never higher than the geometrical mean. It also tends towards the least number, minimizing the impact of large outliers and maximizing the impact of small ones. The F-measure therefore tends to privilege balanced systems.
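To make this concrete: for precision $P$ and recall $R$, the (balanced) F-measure is exactly the harmonic mean of the two,
$$F=\frac{2}{\frac{1}{P}+\frac{1}{R}}=\frac{2PR}{P+R},$$
so a system with near-zero precision or near-zero recall receives a near-zero $F$, no matter how large the other quantity is.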
Best Answer
Just as the geometric mean of anything and $0$ is $0$, it is usually natural to define the harmonic mean of anything and $0$ to be $0$.
One physical interpretation of the harmonic mean is that if you have resistors in parallel, the total resistance is as though each resistor had the harmonic mean resistance. If one of the resistors has no resistance, there is no resistance overall (a short), and this is the same as if all resistors had no resistance.
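Written out: for resistors $R_1,\dots,R_n$ in parallel,
$$\frac{1}{R_{tot}}=\sum_{i=1}^n\frac{1}{R_i},\qquad\text{so}\qquad R_{tot}=\frac{H}{n},$$
and as any single $R_i\to 0$ the sum of reciprocals diverges, giving $R_{tot}\to 0$ and hence $H\to 0$.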
If for some reason you are considering the harmonic means of numbers so that some are negative and some are positive, then it might be better to say that a harmonic mean of $0$ with itself is not defined. However, in the applications I know for the harmonic mean, it is used on nonnegative numbers.
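Under this convention, the harmonic mean in the question can be evaluated as a limit: replace the $0$ by $x>0$ and let $x\to 0^+$,
$$H(3,4,5,x)=\frac{4}{\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{x}}\longrightarrow 0\quad\text{as } x\to 0^+,$$
since $1/x\to\infty$ in the denominator. So $H(3,4,5,0)=0$.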