When would we use tantiles and the medial, rather than quantiles and the median?

descriptive statistics, mean, median, partial-moments, quantiles

I can't find definitions for either tantile or medial on Wikipedia or Wolfram Mathworld, but the following explanation is given in Bílková, D. and Mala, I. (2012), "Application of the L-moment method when modelling the income distribution in the Czech Republic", Austrian Journal of Statistics, 41 (2), 125–132.

The medial is the value of a $50\%$ (sample) tantile just as the sample median equals the value of a $50\%$ sample quantile. Sample tantiles as well as sample quantiles are based on an ordered sample. First of all, cumulative sums of observations in the ordered sample are evaluated. Then, for a given percentage $p$, $0<p<100$, a $p\%$ tantile is defined as the value of the analysed variable that divides all observations in the ordered sample into two parts: the sum of smaller or equal observations is $p\%$ of the total sum of observations and the sum of observations that are greater represents the residual $(100-p)\%$ of this sum.
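To make the quoted definition concrete, here is a minimal Python sketch of a sample tantile. It rests on assumptions that are mine, not the paper's: the observations are non-negative, and because a finite sample rarely splits the total exactly, the tantile is taken as the smallest ordered observation whose cumulative sum reaches $p\%$ of the total. The function name `sample_tantile` and the toy incomes are hypothetical.

```python
from itertools import accumulate
from statistics import median


def sample_tantile(values, p):
    """Return the p% sample tantile of non-negative observations.

    Convention (an assumption): the smallest ordered observation whose
    cumulative sum is at least p% of the total sum.
    """
    if not 0 < p < 100:
        raise ValueError("p must satisfy 0 < p < 100")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    target = p / 100 * sum(ordered)
    # Walk through the ordered sample alongside its cumulative sums.
    for x, running in zip(ordered, accumulate(ordered)):
        if running >= target:
            return x
    return ordered[-1]  # guard against floating-point shortfall


incomes = [10, 20, 30, 40, 100]          # toy data, total = 200
medial = sample_tantile(incomes, 50)     # 50% tantile
print("median:", median(incomes), "medial:", medial)
```

With these toy incomes the cumulative sums are 10, 30, 60, 100, 200, so the medial is 40 while the median is 30.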

When does it make sense to use these as measures of location, rather than the more conventional median or other quantiles? One possible situation, household incomes, is given in that paper:

It can be derived from this definition that the medial can be used as a reasonable characteristic of the level of income, since households with the income lower or equal to the medial receive one half of the total income in the sample, those with the income higher than the medial receiving the other half.

In this case, the median household income was found to be CZK 117,497 (i.e. half of households earned less than this and half earned more), compared to a medial household income of CZK 133,930 (households with an income above this figure receive one half of total income). Note that this comparison doesn't necessarily reflect the skewness of household incomes, or even its non-uniformity: even if household incomes were uniformly distributed, the medial would still lie above the median. As far as I understand the definition, the medial would only equal the median if all households received the same income.
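As a quick check of the uniform case, suppose (purely for illustration; this is not from the paper) that incomes are uniformly distributed on $[0,b]$, so the density is $1/b$ and the median is $b/2$. The income received by households earning at most $t$, per household, is $$ \int_0^t \frac{x}{b}\, dx = \frac{t^2}{2b}, $$ and the overall total per household is $b/2$. Setting the partial sum equal to half of the total gives $$ \frac{t^2}{2b} = \frac{b}{4} \quad\Longrightarrow\quad t = \frac{b}{\sqrt 2} \approx 0.707\,b, $$ which exceeds the median $b/2$ even though the distribution is perfectly symmetric.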

So is there any particular reason to prefer the medial in this case, or at least to use it as a supplementary measure? What exactly does the comparison between median and medial tell us? It doesn't seem that the medial is directly comparable to other measures of central tendency for the reasons I just noted. Are there any other situations where medial/tantiles are widely used or seen as particularly informative? Practical examples of where they are used, with sample research papers, would be very welcome, and an intuitive idea of the broader context in which they might prove useful would be even better.

Tantiles require totals and subtotals to be meaningful, which seems relevant with money and how "the pie" is distributed, but the act of addition is only meaningful for certain quantities. For intensive rather than extensive properties, such as density or temperature, any sort of summation would not be physically meaningful. It seems to me that an extensive property is necessary but not sufficient for tantiles to be helpful, since I can imagine a shipping analyst interested in what weight of cargo transported is the cut-off so that 50% of all cargo (by weight) is carried in loads of that weight or above, yet I can't imagine an ecologist interested in what length of newt is such that 50% of the total length of all newts is contributed by newts of that length or more.

Best Answer

This is really a comment, but it is too long for a comment. It tries to clarify the definition of "tantile" (in the $p=0.5$ case, which is analogous to the median). Let $X$ be (for simplicity) an absolutely continuous random variable with density function $f(x)$. We assume that the expectation $\mu= \mathbb E X$ exists, that is, that the integral $\mu=\int_{-\infty}^\infty x f(x)\; dx $ converges. Define, by analogy with the cumulative distribution function, a "cumulative expectation function" (I have never seen such a concept; does it have an official name?) by $$ G(t) = \int_{-\infty}^t x f(x) \; dx $$ Then the "tantile" is the solution $t^*$ of the equation $G(t^*) = \mu/2$.

Is this interpretation correct? Is this what was intended?

To return to the original question, in the context of an income distribution, the tantile is the income level such that half of total income goes to people earning above it, and half of total income goes to people earning below it.

EDIT

These quantities (the function $G(t)$ above) are related to various risk measures used in some financial literature, such as "expected shortfall".

Have a look at the paper A J Ostaszewski & M B Gietzmann: "Value Creation with Dye's Disclosure Option: Optimal Risk-Shielding with an Upper Tailed Disclosure Strategy" (May 2006), especially around page 15, where they define something they call the "Hemi-mean", which is related to $G(t)$ above, also called the "expected shortfall relative to $t$" and also known as the "first lower partial moment". It would be interesting to look into these connections ...

Another term used for this idea is "partial expectation". See for instance https://math.stackexchange.com/questions/1080530/the-partial-expectation-mathbbex-xk-for-an-alpha-stable-distributed-r, and a web search for the term turns up more.

Also, the book by Kotz & Kleiber, "Statistical Size Distributions in Economics and Actuarial Science", gives relevant information. On page 22 they define (here $X>0$) $$ F_k(x) = \frac1{E X^k} \int_0^x t^k f(t)\; dt $$ which is "the $k$th-moment distribution". Note that $G(t)=\mu F_1(t)$, so $G$ is basically the first-moment distribution, up to the factor $\mu$. They refer to Champernowne (1974), who calls $F_1$ the "income curve" and denotes the underlying cdf $F$ by $F_0$. In terms of the first-moment distribution, the Lorenz curve can be given as $$ \{(u, L(u))\} = \{(u,v)\colon u=F(x),v=F_1(x); x\ge 0\} $$
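The parametric form of the Lorenz curve has a direct sample analogue, which also shows where the medial sits on it. The following is a minimal Python sketch, assuming positive observations and using empirical versions of $F_0$ and $F_1$; the function name `lorenz_points` and the toy data are hypothetical and not taken from the book.

```python
def lorenz_points(values):
    """Return (x, F0(x), F1(x)) at each ordered observation x.

    F0 is the empirical cdf (share of observations <= x) and F1 is the
    empirical first-moment distribution (share of the total sum
    contributed by observations <= x). Assumes positive values.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    points = []
    running = 0
    for i, x in enumerate(ordered, start=1):
        running += x
        points.append((x, i / n, running / total))
    return points


incomes = [10, 20, 30, 40, 100]  # toy data
pts = lorenz_points(incomes)

# Median: first x at which F0 reaches 1/2; medial: first x at which F1 does.
median_x = next(x for x, u, v in pts if u >= 0.5)
medial_x = next(x for x, u, v in pts if v >= 0.5)
for x, u, v in pts:
    print(f"x={x:>3}  F0={u:.2f}  F1={v:.2f}")
print("median:", median_x, "medial:", medial_x)
```

For the toy data the medial (40) is the point where $F_1$ reaches $1/2$, and the corresponding $F_0$ value of 0.8 says that the lower 80% of observations account for half of the total. Read this way, the medial is the median of the first-moment distribution, and the pair $(F_0, F_1)$ at the medial is the point on the Lorenz curve with ordinate $1/2$.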
