Can anyone explain in simple English why the mean and the median play an important role in data mining? What is actually the use of finding a mean or a median?
Many people also say that the median is sometimes better than the mean. Why is that? I'm a novice in data mining, so it would be really helpful if the answer were in simple terms.
Solved – How do mean and median play an important role in data mining
mean, median
Best Answer
Often you want to reduce many measurements of something to a single value, because that is easier to handle and understand than the complete distribution, and you are OK with the information loss. So you take some "average" that should be representative of the distribution of the values. This "average" should be in the "middle" of the distribution. There are many ways to calculate the "middle", and Wikipedia lists some of them.
The median is exactly that middle value: about half of the values are smaller and about half are larger. Only the rank of the measurements matters, not their actual size, which makes the median robust against outliers and skewed distributions.
The arithmetic mean is the sum of all the values divided by how many there are. Here the actual numbers matter: every value contributes, so a single extreme value can pull the mean a long way.
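To make the difference concrete, here is a minimal sketch in Python using the standard `statistics` module. The income figures and the variable names are made up purely for illustration; the point is only to show how one extreme value affects each measure.

```python
from statistics import mean, median

# Hypothetical yearly incomes (made-up numbers, for illustration only)
incomes = [30_000, 32_000, 35_000, 38_000, 40_000]

# The same data plus one extreme value (an outlier)
with_outlier = incomes + [1_000_000]

mean_clean = mean(incomes)
median_clean = median(incomes)
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

print("without outlier: mean =", mean_clean, " median =", median_clean)
print("with outlier:    mean =", round(mean_outlier), " median =", median_outlier)
```

Without the outlier, the mean and the median agree. With it, the mean jumps to a value larger than five of the six incomes, while the median barely moves, because it only depends on which value sits in the middle of the sorted list.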
Which one is "better" depends on your application. Many people prefer a "robust" estimator such as the median, because they don't know the underlying distribution and want to be on the safe side.