Steps of getting standard deviation. http://www.techbookreport.com/tutorials/stddev-30-secs.html:
Work out the average (mean value) of your set of numbers
Work out the difference between each number and the mean
Square the differences
Add up the square of all the differences
Divide this by one less than the number of numbers in your set – this is called the variance
Take the square root of the variance and you've got the standard deviation
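The steps above can be sketched in plain Python (the data values here are just an illustrative made-up sample):

```python
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(data) / len(data)             # step 1: the mean
diffs = [x - mean for x in data]         # step 2: difference from the mean
squares = [d ** 2 for d in diffs]        # step 3: square the differences
total = sum(squares)                     # step 4: add up the squares
variance = total / (len(data) - 1)       # step 5: divide by one less than n
std_dev = variance ** 0.5                # step 6: square root gives the standard deviation

print(std_dev)
```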
Am I missing something, or why do we need to square the differences in step 3? Why not simply take the absolute value (multiply all negative numbers by -1) in step 3?
Also, my second question: why do we need to divide by one less than the number of numbers in the set in step 5? Why not simply divide by the number of numbers?
Best Answer
You're not missing anything: your alternative formulation, taking the absolute values of the differences instead of squaring them, is called the mean absolute deviation (or average absolute deviation).
Both the mean absolute deviation and the standard deviation are used in practice, but much of the reason the standard deviation is more widely used is that it has nicer theoretical properties. For example, the mean and standard deviation are enough to specify which member of the family of normal distributions you are dealing with (edit: although this is convention, as Robert Israel notes in his comment below), and data values $x$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$ can be transformed to data values $z$ from the standard normal distribution via $z = (x - \mu)/\sigma$. Another advantage of the standard deviation, as Robert Israel notes below, is that there is a simple formula for the standard deviation of the sum of independent random variables. (See also the paper referenced below for more on why we use the standard deviation, as well as some arguments in favor of the mean absolute deviation.)
For an answer to your second question, see my answer to "Sample Standard Deviation vs. Population Standard Deviation." In short, if you were calculating the standard deviation of a population rather than a sample, you would divide by the population size $n$. However, when you calculate the standard deviation of a sample, you have to estimate the population mean that would normally be in the formula with the sample mean. Doing so introduces a bias, as the data values tend to be slightly closer to the sample mean than to the population mean (as the sample mean is itself calculated from the data values). It turns out that dividing by $n-1$ rather than $n$ corrects that bias. (Proving that is a standard exercise in beginning statistical theory.)
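The bias described above is easy to see by simulation. This sketch repeatedly draws small samples from a population with known variance 1 and compares the divide-by-$n$ estimator to the divide-by-$(n-1)$ (Bessel-corrected) one; the sample size and trial count are arbitrary choices for illustration:

```python
import random

random.seed(0)

mu, sigma = 0.0, 1.0        # true population parameters, so true variance is 1
n, trials = 5, 200_000      # small samples make the bias pronounced

biased_sum = corrected_sum = 0.0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    m = sum(sample) / n                          # sample mean (estimates mu)
    ss = sum((x - m) ** 2 for x in sample)       # sum of squared deviations
    biased_sum += ss / n                         # divide by n: biased low
    corrected_sum += ss / (n - 1)                # divide by n - 1: unbiased

print(biased_sum / trials)       # should average near (n-1)/n = 0.8
print(corrected_sum / trials)    # should average near the true variance, 1.0
```

The divide-by-$n$ average comes out around $(n-1)/n$ of the true variance, exactly the shortfall that Bessel's correction repairs.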
Going back to your first question, I recently ran across the paper "Revisiting a 90-year-old debate: the advantages of the mean deviation," by Stephen Gorard. The paper is worth reading in full, but let me summarize some of his main points.
Reasons for the standard deviation:
Reasons for the mean absolute deviation: