Solved – Variance and asymmetry on relative frequency class distribution

frequency, normal distribution, standard deviation, variance

I don't know how to solve this (easy) exercise. I've calculated the first output, but I don't know if it's correct.

Calculate the arithmetic mean, variance (standard deviation^2), concentration and asymmetry.

INPUT:

Class           Relative Frequency
10-25           0.48
25-40           0.25
40-60           0.15
60-100          0.1
100-200         0.02

I started by finding the middle value of each class:

Class           Middle value
10-25           17.5
25-40           32.5
40-60           50
60-100          80
100-200         150

The arithmetic mean should be Σ(middle values * relative frequency)

Middle value    Relative Frequency      Weighted value
17.5            0.48                    8.4
32.5            0.25                    8.125
50              0.15                    7.5
80              0.1                     8
150             0.02                    3
                                        SUM = 35.025
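
The two tables above can be reproduced with a short Python sketch. It only restates the arithmetic already shown; the variable names are illustrative and not part of the exercise.

```python
# Class boundaries and relative frequencies from the exercise
classes = [(10, 25), (25, 40), (40, 60), (60, 100), (100, 200)]
rel_freq = [0.48, 0.25, 0.15, 0.10, 0.02]

# Middle value of each class: average of its lower and upper bound
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Weighted value of each class: middle value * relative frequency
weighted = [m * f for m, f in zip(midpoints, rel_freq)]

# The sum of the weighted values is the estimated arithmetic mean
mean = sum(weighted)

for (lo, hi), m, f, w in zip(classes, midpoints, rel_freq, weighted):
    print(f"{lo}-{hi}\t{m}\t{f}\t{round(w, 3)}")
print("SUM =", round(mean, 3))
```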

The arithmetic mean should be 35.025. Or is it 35.025 / n (which is 7.005)? I don't know whether dividing by n is necessary, since the values are already weighted. And the variance? I found the formula Σ(middle value^2 * relative frequency) - mean^2, which gives 649.311. Is that correct? The same goes for asymmetry: the formula I found is Σ(middle value^3 * relative frequency) - mean^3. Is that correct?

Best Answer

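Your calculations give only estimates, because each class is represented by its middle value. The mean is exact only if the observations in each class average to that middle value (for example, if they are uniformly distributed within the class), and the variance computed from middle values still ignores the spread inside each class.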

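You're right about the arithmetic mean: since relative frequencies are used, you shouldn't divide by n. Dividing by n is needed only when absolute frequencies are used instead of relative ones, and there n is the total number of observations, not the number of classes.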
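
To illustrate the point about dividing by n, here is a minimal Python sketch. The exercise does not state the number of observations, so `n = 100` is a hypothetical value assumed only for this example.

```python
midpoints = [17.5, 32.5, 50, 80, 150]
rel_freq = [0.48, 0.25, 0.15, 0.10, 0.02]

# Hypothetical total number of observations (an assumption, not from the exercise)
n = 100

# Absolute frequencies implied by that assumption: [48, 25, 15, 10, 2]
abs_freq = [round(f * n) for f in rel_freq]

# With relative frequencies: no division by n
mean_from_rel = sum(m * f for m, f in zip(midpoints, rel_freq))

# With absolute frequencies: divide by the total count n
mean_from_abs = sum(m * c for m, c in zip(midpoints, abs_freq)) / n

print(round(mean_from_rel, 3), round(mean_from_abs, 3))
```

Both routes give the same mean, because each relative frequency is just the absolute frequency already divided by n.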

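Your variance formula is also correct. The asymmetry formula is not, if asymmetry means the third central moment $\sum_{i=1}^k (\text{middle value}_i-\bar x)^3 \cdot \text{relative frequency}_i$: expanding the cube gives $\sum_{i=1}^k \text{middle value}_i^3 \cdot \text{relative frequency}_i - 3\bar x \sum_{i=1}^k \text{middle value}_i^2 \cdot \text{relative frequency}_i + 2\bar x^3$, which is not the same as subtracting $\bar x^3$ from the first sum. Variance can also be calculated directly from its definition, which is more intuitive: $$ \sum_{i=1}^k (\text{middle value}_i-\bar x)^2 \cdot \text{relative frequency}_i $$ where $k$ is the number of classes.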
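
As a quick numerical check, the following Python sketch computes the variance both from the definition and from the shortcut formula in the question. The variable names are illustrative.

```python
midpoints = [17.5, 32.5, 50, 80, 150]
rel_freq = [0.48, 0.25, 0.15, 0.10, 0.02]

# Estimated mean: sum of middle value * relative frequency
mean = sum(m * f for m, f in zip(midpoints, rel_freq))

# Definition: weighted squared deviations from the mean
var_definition = sum((m - mean) ** 2 * f for m, f in zip(midpoints, rel_freq))

# Shortcut from the question: weighted squares minus the squared mean
var_shortcut = sum(m ** 2 * f for m, f in zip(midpoints, rel_freq)) - mean ** 2

print(round(var_definition, 3), round(var_shortcut, 3))
```

Both values agree at about 649.312, which matches the 649.311 reported in the question up to rounding.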

To show that $\sum_{i=1}^k (mv_i-\bar x)^2 \cdot rf_i = \left(\sum_{i=1}^k mv_i^2 \cdot rf_i\right)-\bar x^2$, where $mv_i$ is the middle value and $rf_i$ the relative frequency of class $i$, expand the square:

$$ \sum_{i=1}^k (mv_i-\bar x)^2 \cdot rf_i=\sum_{i=1}^k \left(mv_i^2-2 \cdot mv_i \cdot \bar x+\bar x^2\right) \cdot rf_i $$

$$ =\left(\sum_{i=1}^k mv_i^2 \cdot rf_i\right)-\left(\sum_{i=1}^k 2 \cdot mv_i \cdot \bar x \cdot rf_i\right)+\left(\sum_{i=1}^k \bar x^2 \cdot rf_i\right) $$

Now inspect the 2nd term. Constant multipliers can be taken out of the summation:

$$ \sum_{i=1}^k 2 \cdot mv_i \cdot \bar x \cdot rf_i =2 \cdot \bar x \cdot \sum_{i=1}^k mv_i \cdot rf_i $$

Recalling that $\sum_{i=1}^k mv_i \cdot rf_i$ is the formula for the mean ($\bar x$):

$$ \sum_{i=1}^k 2 \cdot mv_i \cdot \bar x \cdot rf_i = 2 \cdot \bar x^2 $$

The 3rd term is handled similarly:

$$ \sum_{i=1}^k \bar x^2 \cdot rf_i =\bar x^2 \cdot \sum_{i=1}^k rf_i $$

Since the relative frequencies of all classes add up to 1 by definition, $\sum_{i=1}^k rf_i=1$, so:

$$ \sum_{i=1}^k \bar x^2 \cdot rf_i =\bar x^2 $$

Plugging the results for the 2nd and 3rd terms into the original expression:

$$ \left(\sum_{i=1}^k mv_i^2 \cdot rf_i\right)-2 \cdot \bar x^2+\bar x^2=\left(\sum_{i=1}^k mv_i^2 \cdot rf_i\right)-\bar x^2 $$
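
The same kind of expansion can be checked numerically for the asymmetry. The sketch below assumes that "asymmetry" means the third central moment (the exercise may intend a different measure, such as a standardized skewness coefficient). It compares the definition, its expansion $\sum mv_i^3 \cdot rf_i - 3\bar x \sum mv_i^2 \cdot rf_i + 2\bar x^3$, and the formula from the question. The helper `raw_moment` is a hypothetical name introduced for illustration.

```python
midpoints = [17.5, 32.5, 50, 80, 150]
rel_freq = [0.48, 0.25, 0.15, 0.10, 0.02]


def raw_moment(order):
    """Sum of middle value**order weighted by relative frequency."""
    return sum(m ** order * f for m, f in zip(midpoints, rel_freq))


mean = raw_moment(1)
variance = raw_moment(2) - mean ** 2

# Third central moment straight from its definition
m3_definition = sum((m - mean) ** 3 * f for m, f in zip(midpoints, rel_freq))

# Expansion of the cube, analogous to the variance derivation
m3_expanded = raw_moment(3) - 3 * mean * raw_moment(2) + 2 * mean ** 3

# Formula from the question, which drops the cross terms
m3_question = raw_moment(3) - mean ** 3

# Standardized skewness, assuming the usual m3 / sd^3 convention
skewness = m3_definition / variance ** 1.5

print(round(m3_definition, 3), round(m3_expanded, 3), round(m3_question, 3))
```

The definition and the expansion agree (about 37411.145), while the formula from the question gives a different number (about 105637.591), so the variance shortcut does not carry over to the third moment simply by changing the exponent.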
