So, I have a data set of percentages like so:
100 / 10000 = 1% (0.01)
2 / 5 = 40% (0.4)
4 / 3 = 133% (1.3)
1000 / 2000 = 50% (0.5)
I want to find the standard deviation of the percentages, but weighted for their data volume. ie, the first and last data points should dominate the calculation.
How do I do that? And is there a simple way to do it in Excel?
Best Answer
The formula for weighted standard deviation is:
$$ \sqrt{ \frac{ \sum_{i=1}^N w_i (x_i - \bar{x}^*)^2 }{ \frac{(M-1)}{M} \sum_{i=1}^N w_i } },$$
where
$N$ is the number of observations.
$M$ is the number of nonzero weights.
$w_i$ are the weights
$x_i$ are the observations.
$\bar{x}^*$ is the weighted mean.
Remember that the formula for weighted mean is:
$$\bar{x}^* = \frac{\sum_{i=1}^N w_i x_i}{\sum_{i=1}^N w_i}.$$
Use the appropriate weights to get the desired result. In your case I would suggest to use $\frac{\mbox{Number of cases in segment}}{\mbox{Total number of cases}}$.
To do this in Excel, you need to calculate the weighted mean first. Then calculate the $(x_i - \bar{x}^*)^2$ in a separate column. The rest must be very easy.