Solved – What percentage of the students scored more than one standard deviation above the mean

standard deviation

I was given a question for an assignment but I don't understand whether or not I have the right answer…

The question is:

What percentage of the students scored more than one standard deviation above the mean?

I was given a data set of 50 students' scores in a statistics course and calculated the following using Minitab.

  • Mean: 62.46
  • Median: 61.00
  • Variance: 237.76

I calculated the standard deviation as 5.7035 by taking the square root of the variance.

The score at one standard deviation above the mean would then be 68.1635.

Is my answer supposed to be 15.8%? Looking at a symmetrical distribution curve, the band between the mean and one standard deviation above it contains 34.1%, so I added the three remaining bands above it to find the percentage:

13.6 + 2.1 + 0.1 = 15.8%

Or am I supposed to use 68.1635 to figure out the percentage?


Best Answer

Why are you using the normality assumption? You do not know the distribution of scores in the sample. So, given a dataset (denote it by s, a vector of the student scores), the following formula gives the exact result for any distribution (the R implementation follows it):

$$ \boldsymbol{s} = (s_1, \ldots, s_n), \quad \mathrm{ans} = \frac{\#\left\{s_i\colon s_i > \left( \bar{\boldsymbol{s}} + \sqrt{\frac{1}{n-1} (\boldsymbol{s} - \bar{\boldsymbol{s}})' (\boldsymbol{s} - \bar{\boldsymbol{s}})} \right)\right\}}{n} \cdot 100\% $$ where $\bar{\boldsymbol{s}} = \frac{1}{n} \sum_{i=1}^{n} s_i$ is the arithmetic mean and $\#\{\cdot\}$ just counts the elements of a set that satisfy the condition.

sum(s > mean(s) + sd(s)) / length(s) * 100

This does exactly what it says on the tin: s > mean(s) + sd(s) returns TRUE for each score that is more than one SD above the mean, sum counts them (TRUE is converted to 1 and FALSE to 0), and dividing by length(s) and multiplying by 100 gives the percentage.
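To see the difference between counting and the normal-curve shortcut, here is a minimal sketch in R. The score vector is made up purely for illustration (it is not the data set from the question), and the variable names cutoff, above, pct_empirical and pct_normal are my own. It also recomputes the square root of the quoted variance, since sqrt(237.76) is about 15.42 rather than 5.7035, so the summary figures in the question may be worth rechecking.

```r
# Hypothetical scores, invented for illustration only (not the asker's data).
s <- c(35, 42, 48, 51, 55, 58, 60, 61, 61, 63, 66, 70, 74, 81, 95)

cutoff <- mean(s) + sd(s)           # one sample SD above the sample mean
above  <- s > cutoff                # TRUE where a score exceeds the cutoff
pct_empirical <- mean(above) * 100  # mean of a logical vector = share of TRUEs

# What the normal-curve shortcut from the question would predict instead
pct_normal <- pnorm(1, lower.tail = FALSE) * 100

# Sanity check on the variance quoted in the question
sd_from_variance <- sqrt(237.76)

cat(sprintf("cutoff = %.2f, empirical = %.1f%%, normal = %.1f%%\n",
            cutoff, pct_empirical, pct_normal))
cat(sprintf("sqrt(237.76) = %.4f\n", sd_from_variance))
```

The empirical percentage depends only on the observed scores, so it need not match the roughly 15.9% that a normal distribution would give. Note that mean(above) * 100 is just a shorter way of writing sum(above) / length(s) * 100.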
