Solved – Median value on ordinal scales

I have been reading about appropriate measures of central tendency for ordinal level data.
So far I have learned that the median and mode can be used, but that the latter can only be used in some cases. Some sources state that the median can only be used with Likert questions when there is an odd number of scores. It is not clear to me what this means, or in which cases the median cannot be used.

Example:

An example may illustrate.

Suppose the question were "Climate change is England’s most serious environmental problem", on a response scale of 1 = strongly agree, 2 = agree, 3 = unsure, 4 = disagree, 5 = strongly disagree. Would the median be 3 = unsure?
What if no respondents chose disagree or strongly disagree, and all 100 respondents chose 1, 2, or 3? Is the median then 2?
What if respondents chose only 2 or 3? In that case, is it impossible to identify the median?

Best Answer

Definitional issues:

The median is the middle value of the data; it is not by definition the middle value of the scale.
When the sample size is even, the median is the mean of the two values on either side of the midpoint after rank-ordering all values (see the Wikipedia description).
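The scenarios in the question can be checked directly against these two definitions. The following is a minimal Python sketch using the standard-library `statistics` module. The response counts are invented for illustration (the question gives none), and the codes 1 to 5 follow the scale in the question.

```python
import statistics

# Hypothetical counts: 100 respondents who all chose 1, 2, or 3.
responses_123 = [1] * 40 + [2] * 35 + [3] * 25
# Hypothetical counts: 100 respondents split evenly between 2 and 3.
responses_23 = [2] * 50 + [3] * 50

# The median is the middle of the data, not the middle of the 1-5 scale.
median_123 = statistics.median(responses_123)   # both middle values are 2

# With an even sample size, the two middle values (here 2 and 3) are averaged.
median_23 = statistics.median(responses_23)

# median_low / median_high return an observed response category instead.
median_23_low = statistics.median_low(responses_23)
median_23_high = statistics.median_high(responses_23)

print(median_123, median_23, median_23_low, median_23_high)
```

So a median can always be identified. With an even split between 2 and 3 it falls between two categories, which is where the choice between the averaged, low, and high variants matters.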

When to use median on ordinal data

In theory the median can be used on data from any variable where the values can be ordered.
In practice, the median is often not the most useful summary of central tendency with ordinal variables. This partially depends on what you want to get out of your measure of central tendency. When you are describing the central tendency of an ordinal variable with only a small number of response options (say, fewer than 20, 50, or 100), the median can be quite gross, that is, coarse (e.g., 1,1,3,3,3 and 1,3,3,5,5 both have a median of 3, but the second has a higher mean). When it comes to summarising the central tendency of Likert items, I find the mean to be much more useful and more sensitive to meaningful differences. Ordinal variables that are ranks do not suffer from this problem of "grossness".
Interpolated medians are another way of overcoming the gross nature of the median on ordinal data with few values.
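To make the "grossness" point concrete, here is a small Python sketch comparing the plain median, the mean, and an interpolated median on the two example data sets above. The function `interpolated_median` is a hypothetical helper written for this illustration. It uses the usual grouped-data formula and assumes that each category `m` stands for the interval from `m - 0.5` to `m + 0.5` with responses spread evenly inside it, which is an extra assumption beyond pure ordering.

```python
from statistics import mean, median

def interpolated_median(values, width=1.0):
    """Grouped-data (interpolated) median for integer-coded categories.

    Assumes category m covers [m - width/2, m + width/2) and that
    responses are spread evenly within that interval.
    """
    data = sorted(values)
    n = len(data)
    m = data[n // 2]                         # category holding the middle position
    below = sum(1 for v in data if v < m)    # responses below that category
    equal = sum(1 for v in data if v == m)   # responses in that category
    lower = m - width / 2                    # lower boundary of the category
    return lower + width * (n / 2 - below) / equal

a = [1, 1, 3, 3, 3]
b = [1, 3, 3, 5, 5]

for name, data in (("a", a), ("b", b)):
    print(name, median(data), mean(data), round(interpolated_median(data), 3))
```

Both data sets have a plain median of 3, while the mean (2.2 versus 3.4) and the interpolated median (about 2.667 versus 3.25) separate them.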

Related Solutions

Central Tendency – Should the Mean Be Used When Data Are Skewed?

I disagree with the advice as a flat out rule. (It's not common to all books.)

The issues are more subtle.

If you're actually interested in making inference about the population mean, the sample mean is at least an unbiased estimator of it, and has a number of other advantages. In fact, see the Gauss-Markov theorem: it is the best linear unbiased estimator.

If your variables are heavily skewed, the problem comes with "linear". In some situations all linear estimators may be bad, so the best of them may still be unattractive, and a nonlinear estimator of the mean may be better. But using one would require knowing something (or even quite a lot) about the distribution. We don't always have that luxury.

If you're not necessarily interested in inference about a population mean ("what's a typical age?", say, or whether there's a more general location shift from one population to another, which might be phrased in terms of any measure of location, or even a test of whether one variable is stochastically larger than another), then casting the question in terms of the population mean is either unnecessary or, in the last case, likely counterproductive.

So I think it comes down to thinking about:

What are your actual questions? Is the population mean even a good thing to be asking about in this situation?
What is the best way to answer the question given the situation (skewness in this case)? Is using sample means the best approach to answering our questions of interest?

It may be that you have questions not directly about population means, but sample means are nevertheless a good way to look at those questions. For example, the population median of a waiting time that you assume to be exponentially distributed is better estimated as a particular fraction of the sample mean than by the sample median. Or vice versa: the question might be about population means, but sample means might not be the best way to answer it.
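For the exponential example, the "particular fraction" is $\ln 2 \approx 0.693$, since an exponential distribution with mean $\mu$ has median $\mu \ln 2$. The sketch below is only an illustration under that distributional assumption. The function name, the seed, the sample size, and the true mean of 10 are all arbitrary choices made here, not taken from the answer.

```python
import math
import random
import statistics

def exponential_median_estimate(sample):
    """Estimate the population median as (sample mean) * ln 2.

    Only appropriate if the data are assumed to be exponentially distributed.
    """
    return statistics.mean(sample) * math.log(2)

random.seed(0)                       # arbitrary seed, for repeatability only
true_mean = 10.0                     # hypothetical mean waiting time
sample = [random.expovariate(1 / true_mean) for _ in range(200)]

print("true median:         ", true_mean * math.log(2))
print("mean-based estimate: ", exponential_median_estimate(sample))
print("sample median:       ", statistics.median(sample))
```

Both estimates target the same quantity. The mean-based one leans entirely on the exponential assumption, so it can be badly off if that assumption is wrong.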

Solved – When would we use tantiles and the medial, rather than quantiles and the median

This is really a comment, but it is too long for one. It tries to clarify the definition of "tantile" (in the $p=0.5$ case, which is analogous to the median). Let $X$ be, for simplicity, an absolutely continuous random variable with density function $f(x)$. We assume that the expectation $\mu= \mathbb E X$ exists, that is, that the integral $\mu=\int_{-\infty}^\infty x f(x)\; dx$ converges. Define, by analogy with the cumulative distribution function, a "cumulative expectation function" (I have never seen such a concept; does it have an official name?) by $$ G(t) = \int_{-\infty}^t x f(x) \; dx $$ Then the "tantile" is the solution $t^*$ of the equation $G(t^*) = \mu/2$.

Is this interpretation correct? Is this what was intended?

To return to the original question: in the context of an income distribution, the tantile is the income level such that half of total income goes to people earning more than that level, and half goes to people earning less.
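A sample analogue of this $p=0.5$ tantile can be sketched in a few lines of Python. This is an illustration only: the function name `medial` and the convention of returning the smallest observed income at which the running total reaches half of all income are assumptions made here, and the income figures are made up.

```python
from statistics import median

def medial(incomes):
    """Smallest observed income x such that people earning <= x
    together hold at least half of total income (incomes assumed positive)."""
    data = sorted(incomes)
    if not data:
        raise ValueError("medial() needs at least one income")
    half_total = sum(data) / 2
    running = 0.0
    for x in data:
        running += x                 # empirical version of G(t)
        if running >= half_total:
            return x

incomes = [10, 20, 30, 40, 100]      # made-up incomes, total 200

print(median(incomes))   # 30: half of the people earn less than this
print(medial(incomes))   # 40: incomes up to 40 add up to half of the total
```

With right-skewed incomes the medial sits above the median, because the top earners account for a disproportionate share of the total.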

EDIT

These quantities (the function $G(t)$ above) are related to various risk measures used in some financial literature, such as "expected shortfall".

Have a look at the paper by A J Ostaszewski & M B Gietzmann, "Value Creation with Dye's Disclosure Option: Optimal Risk-Shielding with an Upper Tailed Disclosure Strategy" (May 2006), especially around page 15, where they define something they call the "Hemi-mean", which is related to $G(t)$ above and is also called "expected shortfall relative to $t$" or the "first lower partial moment". It would be interesting to look into these connections ...

Another term used for this idea is "partial expectation". See for instance https://math.stackexchange.com/questions/1080530/the-partial-expectation-mathbbex-xk-for-an-alpha-stable-distributed-r and use google!

Also, the book Kotz & Kleiber, "Statistical Size Distributions in Economics and Actuarial Science", gives relevant information. On page 22 they define (here $X>0$) $$ F_k(x) = \frac1{E X^k} \int_0^x t^k f(t)\; dt $$ which is "the $k$th-moment distribution". Note that $G(t)=\mu F_1(t)$, so $G$ is basically the first-moment distribution. They refer to Champernowne (1974), who calls $F_1$ the "income curve" and denotes the underlying cdf $F$ by $F_0$. In terms of the first-moment distribution, the Lorenz curve can be given as $$ \{(u, L(u))\} = \{(u,v)\colon u=F(x),v=F_1(x); x\ge 0\} $$
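The parametric form of the Lorenz curve has a direct empirical counterpart: plot the cumulative share of people, $F$, against the cumulative share of income, $F_1$. The Python sketch below is a hypothetical illustration of that construction on a made-up sample. The function name `lorenz_points` is introduced here and does not come from the book.

```python
def lorenz_points(incomes):
    """Empirical Lorenz curve as a list of (u, v) points, where
    u = share of people (empirical F) and
    v = share of total income (empirical first-moment distribution F_1)."""
    data = sorted(incomes)
    n = len(data)
    total = sum(data)
    points = [(0.0, 0.0)]
    running = 0
    for i, x in enumerate(data, start=1):
        running += x
        points.append((i / n, running / total))
    return points

for u, v in lorenz_points([1, 1, 2, 4]):   # made-up incomes
    print(u, v)
```

In this sample, the medial from the earlier definition corresponds to the point where $v$ reaches 0.5, which happens at $u = 0.75$.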