After reviewing the literature I could find, I don't think there's any legitimate sense in which the standard deviation is (b-a)/6. The denominator is fixed because the PERT estimate is calculated as a weighted average of 3 points, with one of the points counted 4 times:
$$
d = \frac{a + 4b + c}{6}.
$$
So in a sense, it seems like they're acting like they have 6 points. But the numerator on that SD is quite magical to me.
edit
Ah, so my denominator guess was completely wrong (I was wondering why it wasn't $\sqrt{6}$ anyway). They are "backing into" the notion of standard deviation by saying the distance from optimistic to pessimistic should be $6\sigma$. Then, as seen in the snippet above, they treat each path as a sum of random variables and get the standard deviation along the path by taking $\sqrt{\sigma_1^2 + \dots + \sigma_k^2}$. It's more reasonable than I initially gave it credit for.
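To make the arithmetic concrete, here is a minimal sketch of that reasoning. It assumes the three points in the formula are (optimistic, most likely, pessimistic), so the six-sigma range is the pessimistic value minus the optimistic one, and it assumes task durations are independent so that variances add along a path. The function names and the task numbers are made up for illustration.

```python
import math

def pert_mean(a, b, c):
    """Weighted PERT estimate: the middle point counts four times."""
    return (a + 4 * b + c) / 6

def pert_sigma(a, c):
    """Sigma implied by treating the optimistic-to-pessimistic range as 6 sigma."""
    return (c - a) / 6

def path_sigma(tasks):
    """Sigma of a path, assuming independent tasks (variances add)."""
    return math.sqrt(sum(pert_sigma(a, c) ** 2 for a, _, c in tasks))

# Hypothetical path of three tasks: (optimistic, most likely, pessimistic)
tasks = [(2, 4, 8), (1, 2, 9), (3, 5, 7)]

for a, b, c in tasks:
    print(f"estimate={pert_mean(a, b, c):.2f}  sigma={pert_sigma(a, c):.2f}")
print(f"path sigma={path_sigma(tasks):.2f}")
```

Note that the path sigma is smaller than the plain sum of the task sigmas, which is the whole point of adding variances rather than standard deviations.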
Question: Usefulness of standard deviation/alternatives for highly variable measurements?
Standard deviation will tell you whether or not the measurements are highly variable. You don't use the standard deviation itself to predict the weather; you use it to tell whether the other value (the one for which the standard deviation is provided) can be relied on as a predictor.
Even that alone is no guarantee. Example: it has rained on this date every year for the past 100 years; will it rain today? Answer: there's a good chance, but if there are no clouds in the sky, the chance is 0%. The standard deviation of a single value is not the certainty of a result.
A simple example is provided on J. Smith of SNU's webpage on standard deviation:
"Everybody knows that when it comes to climate and weather, there really is no difference between Oklahoma and Hawaii. What?!?!?! You mean you don't believe me? Well, let's look at the statistics (after all, this is a stat course). The average (mean) daily temperature in Hawaii is 78 degrees Fahrenheit. The average daily temperature in Oklahoma is 77 degrees Fahrenheit. You see...no difference.
You still don't buy it huh? Well you are indeed smarter than you look. But how about those numbers? Are they wrong? Nope, the numbers are fine. But what we learn here is that our measures of central tendency (mean, median and mode) are not always enough to give us a complete picture of a distribution. We need more information to distinguish the difference.
Well before we go any further, let me ask a question: Which average temperature more accurately describes that state? Is 78 degrees more accurate of Hawaii than 77 degrees is of Oklahoma? Well if you live in Oklahoma I suspect you decided that 77 degrees is a fairly meaningless number when it comes to describing the climate here.
...
Okay...so the mean temperatures were 78 for Hawaii and 77 for Oklahoma...right? But notice the difference in standard deviation. Hawaii is a mere 2.52 while Oklahoma came in at 10.57. What does this mean you ask? Well the standard deviation tells us the standard amount that the distribution deviates from the average. The higher the standard deviation, the more varied that distribution is. And the more varied a distribution, the less meaningful the mean. You see in Oklahoma, the standard deviation for temperature is higher. This means that our temperatures are much more varied. And because the temperature varies so much, the average of 77 doesn't really mean much. But look at Hawaii. There the standard deviation is very low. This of course means the temperature there does not vary much. And as a result the average of 78 degrees is much more descriptive of the Hawaiian climate. I wonder if that has anything to do with why people want to vacation in Hawaii rather than Oklahoma?"
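The same effect is easy to reproduce with a few numbers. The readings below are invented for illustration (they are not the data behind the quoted 2.52 and 10.57); they are chosen only so that the means come out at 78 and 77 while the spreads differ a lot.

```python
import statistics

# Hypothetical temperature readings in degrees Fahrenheit, not real data
hawaii = [75, 76, 78, 79, 80, 80]
oklahoma = [60, 65, 75, 80, 88, 94]

for name, temps in (("Hawaii", hawaii), ("Oklahoma", oklahoma)):
    mean = statistics.mean(temps)
    sd = statistics.pstdev(temps)  # population SD of the listed readings
    print(f"{name}: mean={mean:.1f}  sd={sd:.2f}")
```

The two means are nearly identical, yet the second list strays much further from its mean, so its mean describes a typical reading far less well.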
From: "Probabilistic Forecasting - A Primer" by Chuck Doswell and Harold Brooks of the National Severe Storms Laboratory Norman, Oklahoma:
"Probabilistic forecasts can take on a variety of structures. As shown in Fig. 0, it might be possible to forecast Q as a probability distribution. [Subject to the constraint that the area under the distribution always sums to unity (or 100 percent), which has not been done for the schematic figure.] The distribution can be narrow when one is relatively confident in a particular Q-value, or wide when one's certainty is relatively low. It can be skewed such that values on one side of the central peak are more likely than those on the other side, or it can even be bimodal [as with a strong quasistationary front in the vicinity when forecasting temperature]. It might be possible to make probabilistic forecasts of going past certain important threshold values of Q. Probabilistic forecasts don't all have to look like PoPs! When forecasting for an area, it is quite likely that forecast probabilities might vary from place to place, even within a single metropolitan area."
Question: However is standard deviation only useful/make sense for normal distributions?
All that standard deviation will tell you about "highly variable measurements" is that they are highly variable, but you knew that already; if the standard deviation is very low you can rely more, but not absolutely, on historical measurements.
As a sidequestion: would the mean value be more accurate, with lower coefficient of variation if one has one million or billion years of measurements of data, even when each data point (spread) is highly variable?
Q: Mean more accurate with more data points?: Yes.
Q: Lower variation (standard deviation)?: No, not if the "data point (spread) is highly variable".
The standard deviation doesn't affect the accuracy of your calculation of the mean: regardless of the standard deviation, you have the same mathematical skills and calculate both the mean and the standard deviation equally well. The point is that when the standard deviation (accurately calculated) is large, the mean (or any other single value) carries less meaning. It's a less useful predictor.
Even with a very low standard deviation, a prediction based on a single value (for example, the mean) isn't 100% reliable.
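A small simulation illustrates the two answers above. It is only a sketch under stated assumptions: the measurements are independent draws from a normal distribution with a made-up mean of 77 and spread of 10, and the precision of the mean is summarized by the usual standard error, SD divided by the square root of n.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def summarize(n, mu=77.0, sigma=10.0):
    """Draw n hypothetical measurements; return (mean, sd, standard error)."""
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sd = statistics.stdev(sample)
    return statistics.mean(sample), sd, sd / math.sqrt(n)

results = {n: summarize(n) for n in (100, 10_000)}
for n, (mean, sd, sem) in results.items():
    print(f"n={n:>6}: mean={mean:.2f}  sd={sd:.2f}  standard error={sem:.3f}")
```

With more data the standard error of the mean shrinks, so the mean is pinned down more precisely, while the standard deviation of the individual measurements stays near 10. Any single day remains just as unpredictable as before.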
Question: Looking for answers which preferably are relevant to above example. Links to relevant studies are highly appreciated. Answers/research that provide intuitive examples/explanations are also highly appreciated. Of course answers to the other questions also are appreciated.
- Understanding the difference between climatological probability and climate probability
- Bayesian probability
"Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief.
The Bayesian interpretation of probability can be seen as an extension of propositional logic that enables reasoning with hypotheses, i.e., the propositions whose truth or falsity is uncertain. In the Bayesian view, a probability is assigned to a hypothesis, whereas under frequentist inference, a hypothesis is typically tested without being assigned a probability.
Bayesian probability belongs to the category of evidential probabilities; to evaluate the probability of a hypothesis, the Bayesian probabilist specifies some prior probability, which is then updated to a posterior probability in the light of new, relevant data (evidence). The Bayesian interpretation provides a standard set of procedures and formulae to perform this calculation."
- Modern Forecasting Papers
That should get you started; each of those papers has citation links that lead to newer papers.
Best Answer
I think of it as a relative measure of spread or variability in the data. Taken alone, the statement "The standard deviation is 2.4" tells you almost nothing without reference to the mean (and thus the unit of measure, I suppose). If the mean is equal to 104, a standard deviation of 2.4 communicates quite a different picture of spread than if the mean were 25,452 with a standard deviation of 2.4.
You normalize data (subtract the mean and divide by the standard deviation) to place data expressed in different units on a comparable footing. For the same reason, this measure of variability is itself normalized, to aid in comparisons.
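As a quick check of the two cases above, here is a sketch that assumes the measure being discussed is the coefficient of variation, taken as the standard deviation divided by the mean. The function name is just for illustration.

```python
def coefficient_of_variation(mean, sd):
    """Relative spread: standard deviation as a fraction of the mean."""
    return sd / mean

small_mean = coefficient_of_variation(104, 2.4)
large_mean = coefficient_of_variation(25_452, 2.4)

print(f"mean 104:    CV = {small_mean:.2%}")
print(f"mean 25,452: CV = {large_mean:.4%}")
```

The same standard deviation of 2.4 is a spread of a couple of percent around 104 but only about a hundredth of a percent around 25,452.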