The F-measure is often used as an evaluation metric in the natural language processing field. In particular, the F-measure was employed by the Message Understanding Conference (MUC) to evaluate named entity recognition (NER) tasks.
Directly quoted from *A survey of named entity recognition and classification* by D. Nadeau:

> The harmonic mean of two numbers is never higher than the geometrical mean. It also tends towards the least number, minimizing the impact of large outliers and maximizing the impact of small ones. The F-measure therefore tends to privilege balanced systems.
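To see the "privileges balanced systems" point concretely, here is a small sketch; the F-measure (F1) is the harmonic mean of precision and recall, and the precision/recall values below are made up for illustration:

```python
def f_measure(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Two hypothetical NER systems with the same arithmetic mean of
# precision and recall (0.70), one balanced and one skewed.
balanced = f_measure(0.70, 0.70)  # equals 0.70 exactly
skewed = f_measure(0.95, 0.45)    # pulled toward the smaller value

print(balanced, skewed)
```

The skewed system scores noticeably lower (about 0.61) despite having the same arithmetic mean, which is exactly the balancing behaviour the quote describes.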
To get a better intuition of what is going on, first realize that the ~68% rule that you mention only holds for the normal distribution. Although the variance and the standard deviation have many useful properties for all kinds of statistical models, there is no strong 'interpretation' other than that a higher standard deviation denotes more dispersed data.
The circular case is similar. There is some motivation for using the circular standard deviation, but it is tied to one specific model.
So, the interpretation is as follows. If we have a (normal) random variable $X \sim N(\mu, \sigma^2)$, then $X_c = X ~~ \text{mod} ~~ 2 \pi$ has a wrapped normal distribution $WN(\mu, \rho)$, where $\rho$ is the mean resultant length, given by $\rho = e^{-\sigma^2 / 2}$. This means we can write the standard deviation of our original random variable as $\sigma = \sqrt{-2 \ln {\rho}},$ which you will of course recognize as the definition of the circular standard deviation.
So, in words, the circular standard deviation is the standard deviation of the normal distribution that, when wrapped, produces a wrapped normal distribution which has the same resultant length as the data set (from which we calculated the circular standard deviation).
So, it still might seem fairly arbitrary, but this is the strongest interpretation of this measure in the circular case.
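To make this interpretation concrete, here is a small simulation sketch (the parameter values are assumed for illustration): wrap a normal sample onto the circle, compute the mean resultant length, and check that the circular standard deviation approximately recovers the original $\sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5  # standard deviation of the underlying (unwrapped) normal

# Draw a normal sample and wrap it onto [0, 2*pi)
angles = rng.normal(loc=1.0, scale=sigma, size=100_000) % (2 * np.pi)

# Mean resultant length rho of the wrapped sample
rho = np.abs(np.mean(np.exp(1j * angles)))

# Circular standard deviation: sqrt(-2 ln rho)
circ_sd = np.sqrt(-2 * np.log(rho))
print(circ_sd)  # close to sigma = 0.5
```

The theoretical relation $\rho = e^{-\sigma^2/2}$ gives $\rho \approx 0.88$ here, and inverting it returns $\sigma$, as the printout confirms.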
Best Answer
The harmonic mean $H$ of random variables $X_1,...,X_n$ is defined as
$$H=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}}$$
Taking moments of fractions is a messy business, so instead I prefer to work with $1/H$. Now
$$\frac{1}{H}=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}.$$
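As a quick sanity check on the definition (the sample values here are made up), the harmonic mean is simply the reciprocal of the arithmetic mean of the reciprocals:

```python
# Harmonic mean computed via the mean of reciprocals
x = [2.0, 2.5, 3.0]
n = len(x)
mean_reciprocal = sum(1.0 / xi for xi in x) / n  # this is 1/H
H = 1.0 / mean_reciprocal
print(H)
```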
Using the central limit theorem we immediately get that
$$\sqrt{n}\left(H^{-1}-EX_1^{-1}\right)\to N(0,VarX_1^{-1})$$
if of course $VarX_1^{-1}<\infty$ and the $X_i$ are iid, since we simply work with the arithmetic mean of the variables $Y_i=X_i^{-1}$.
Now using the delta method for the function $g(x)=x^{-1}$, for which $g'(x)=-x^{-2}$, we get that
$$\sqrt{n}(H-(EX_1^{-1})^{-1})\to N\left(0, \frac{VarX_1^{-1}}{(EX_1^{-1})^4}\right)$$
This result is asymptotic, but for simple applications it may suffice.
**Update** As @whuber rightly points out, "simple applications" is a misnomer. The central limit theorem holds only if $VarX_1^{-1}$ exists, which is quite a restrictive assumption.
**Update 2** If you have a sample, then to estimate the standard deviation, simply plug the sample moments into the formula. So for a sample $X_1,...,X_n$, the estimate of the harmonic mean is
\begin{align} \hat{H}=\frac{1}{\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}} \end{align}
and the sample counterparts of the moments $EX_1^{-1}$ and $Var(X_1^{-1})$, respectively, are
\begin{align} \hat{\mu}_{R}&=\frac{1}{n}\sum_{i=1}^n\frac{1}{X_i}\\\\ \hat{\sigma}_{R}^2&=\frac{1}{n}\sum_{i=1}^n\left(\frac{1}{X_i}-\hat{\mu}_{R}\right)^2 \end{align}
where $R$ stands for reciprocal.
Finally, the approximate formula for the standard deviation of $\hat{H}$ is
\begin{align*} sd(\hat{H})=\sqrt{\frac{\hat{\sigma}_R^2}{n\hat{\mu}_R^4}} \end{align*}
I ran some Monte-Carlo simulations for random variables uniformly distributed on the interval $[2,3]$. Here is the code:
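(The original code snippet is missing from this answer; the following Python sketch reconstructs the simulation described below. The function name `sdhm` comes from the description; the sample sizes `N` and `n` are assumed.)

```python
import numpy as np

rng = np.random.default_rng(42)

def hm(x):
    """Harmonic mean of a sample."""
    return 1.0 / np.mean(1.0 / x)

def sdhm(x):
    """Delta-method estimate of the standard deviation of the harmonic mean."""
    r = 1.0 / x                       # reciprocals
    mu_r = np.mean(r)                 # hat{mu}_R
    var_r = np.mean((r - mu_r) ** 2)  # hat{sigma}_R^2
    return np.sqrt(var_r / (len(x) * mu_r ** 4))

N, n = 10_000, 100  # N samples, each of size n
samples = rng.uniform(2, 3, size=(N, n))

hms = np.array([hm(s) for s in samples])    # harmonic mean of each sample
sds = np.array([sdhm(s) for s in samples])  # delta-method sd estimate of each

# Average delta-method estimate vs. Monte-Carlo sd of the harmonic mean
print(np.mean(sds), np.std(hms))
```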
I simulated `N` samples of size `n`. For each size-`n` sample I calculated the estimate of the standard deviation (function `sdhm`). I then compared the mean and standard deviation of these estimates with the sample standard deviation of the harmonic-mean estimates across the `N` samples, which should approximate the true standard deviation of the harmonic mean.

As you can see, the results are quite good even for moderate sample sizes. Of course, the uniform distribution is a very well behaved one, so it is not surprising that the results are good. I'll leave it to someone else to investigate the behaviour for other distributions; the code is very easy to adapt.
**Note:** A previous version of this answer contained an error in the delta-method result (an incorrect variance).