[Math] How a change in the mean affects the standard deviation

statistics

A college statistics class conducted a survey of how students spend their money. They asked 25 students to estimate how much money they typically spend each week on fast food. They determined that the mean amount spent on fast food is $31.52$ with a standard deviation of $21.60$.
Later they realized that a value entered as $3$ should have been $30$. They recalculated the mean and standard deviation. The mean is now $32.60$.
Which of the following is true about the standard deviation?

  • The standard deviation will increase, because we have increased the value of a data point.

  • The standard deviation will stay the same, because the standard deviation is not affected by a change in a single measurement.

  • The standard deviation will decrease, because this change moved a data point closer to the mean.

Best Answer

The question asks how the standard deviation changes when a single data point changes. For a fixed number of data points the standard deviation is a differentiable function of the data (as long as it is nonzero), so we can answer the question by looking at the sign of the derivative.

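Since all we are interested in is the direction of change and the standard deviation is nonnegative, we can instead look at the square of the standard deviation, also known as the variance: squaring preserves order on nonnegative numbers, so the two always move in the same direction.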

Let's assume without loss of generality that $x_1$ is the data point that is changed. We have $$v = \frac{1}{n}\sum_{i=1}^n(x_i-\bar x)^2$$ Note that $\bar x$ depends on $x_1$ as well, so every term of the sum contributes to the derivative. By the standard differentiation rules we get $$\frac{\mathrm dv}{\mathrm dx_1} = \frac{2}{n}\left[(x_1-\bar x) - \frac{\mathrm d\bar x}{\mathrm dx_1}\sum_{i=1}^n(x_i-\bar x)\right]$$ Since $$\bar x = \frac{1}{n}\sum_{i=1}^n x_i$$ we have $$\frac{\mathrm d\bar x}{\mathrm dx_1} = \frac{1}{n}$$ and the deviations from the mean sum to zero, $$\sum_{i=1}^n(x_i-\bar x) = 0$$ so we finally get $$\frac{\mathrm dv}{\mathrm dx_1} = \frac{2}{n}(x_1-\bar x)$$ The prefactor is positive, therefore this term is positive if $x_1>\bar x$ and negative if $x_1<\bar x$. That means the variance, and therefore the standard deviation, grows if a single data point is moved away from the mean, and shrinks if a single data point is moved towards the mean.
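
As a sanity check on the sign and size of this derivative, here is a minimal Python sketch. It compares a central finite difference of the population variance with the closed form $\frac{2}{n}(x_1-\bar x)$ that results when every term of the sum is differentiated. The data are random and purely illustrative, and the helper names (`population_variance`, `variance_gradient_x1`, `finite_difference_x1`) are made up for this example.

```python
import random


def population_variance(xs):
    """Variance with the 1/n normalisation used in the derivation."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def variance_gradient_x1(xs):
    """Closed-form derivative of the population variance w.r.t. xs[0]."""
    n = len(xs)
    mean = sum(xs) / n
    return 2.0 / n * (xs[0] - mean)


def finite_difference_x1(xs, h=1e-6):
    """Central finite difference of the variance w.r.t. xs[0]."""
    up = [xs[0] + h] + xs[1:]
    down = [xs[0] - h] + xs[1:]
    return (population_variance(up) - population_variance(down)) / (2 * h)


random.seed(0)
# Illustrative data only: 25 made-up weekly spending amounts.
data = [random.uniform(0, 80) for _ in range(25)]

analytic = variance_gradient_x1(data)
numeric = finite_difference_x1(data)
print(analytic, numeric)
```

The two printed numbers should agree up to rounding error, and their common sign matches the sign of $x_1-\bar x$.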

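In the question, the corrected value $30$ is still below the new mean $32.60$, so the data point stays below the mean all the way from $3$ to $30$ and the derivative is negative throughout. Therefore the third bullet point is correct.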
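
The numbers in the question are enough to recompute the standard deviation directly, which gives an independent check. The sketch below uses the identity $\sum(x_i-\bar x)^2 = \sum x_i^2 - n\bar x^2$ to update the sum of squared deviations. The question does not say whether $21.60$ was computed with $n$ or $n-1$ in the denominator, so both are tried. The helper `updated_sd` and its `ddof` parameter are hypothetical names for this example.

```python
import math

n = 25
old_mean, old_sd = 31.52, 21.60
old_value, new_value = 3.0, 30.0

# Replacing one value shifts the mean by (new - old) / n.
new_mean = old_mean + (new_value - old_value) / n


def updated_sd(ddof):
    """Standard deviation after the correction.

    ddof=1 assumes 21.60 is a sample SD (divide by n - 1);
    ddof=0 assumes it is a population SD (divide by n).
    """
    # Sum of squared deviations recovered from the reported SD.
    old_ss = (n - ddof) * old_sd ** 2
    # SS = sum(x^2) - n * mean^2, so update both pieces.
    new_ss = (
        old_ss
        + (new_value ** 2 - old_value ** 2)
        - n * (new_mean ** 2 - old_mean ** 2)
    )
    return math.sqrt(new_ss / (n - ddof))


sd_sample = updated_sd(ddof=1)
sd_population = updated_sd(ddof=0)
print(round(new_mean, 2), round(sd_sample, 2), round(sd_population, 2))
```

Under either convention the recomputed mean is $32.60$, matching the question, and the standard deviation drops from $21.60$ to roughly $20.8$.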