[Math] Median of Medians in 2D Array/Matrix

MATLAB, matrices, median

This is a bit of a mathematics problem, and a MATLAB problem.

In MATLAB, if I call median(M), where M is an m x n 2D matrix, I get a list of n values which represent the median value for each of the n columns.

I'm trying to derive a proof (for my own curiosity) for:

  • Let W be a 1 x n vector that corresponds to the median values for each column in the m x n matrix, M
  • Let W = median(M)
  • Prove or disprove that the median of W is always equal to the median of all values of M
    • In MATLAB parlance, this would be akin to asking, "Is median(median(M)) the true median of M, i.e., does it equal median(V), where V is M reshaped into a 1 x (nm) vector?"

Is there an existing proof or general rule that dictates whether or not the median of a set of data is equal to the median of the medians of arbitrary subsets? Due to linearity, I can assume this to be true for the average (i.e., the arithmetic mean), at least when the subsets are of equal size, as the columns of M are, but I'm not so sure about the mode and median operations.

Thank you.

Best Answer

No, the two are not equal in general. To prove this, a counterexample suffices. There are two cases:

  • Even number of rows/columns: MATLAB's median of an even number of values is the mean of the two central values.

    >> median(median([1 2; 3 40]))
    ans =
       11.5000
    
    >> median([1 2 3 40])
    ans =
       2.5000
    
  • Odd number of rows/columns: no interpolation needed; the median is just the central value:

    >> median(median([1 2 30; 4 5 60; 10 20 30]))
    ans =
         5
    
    >> median([1 2 30 4 5 60 10 20 30])
    ans =
        10
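
The same comparison can be scripted for any matrix. What follows is only a minimal sketch, assuming the 3 x 3 counterexample above; the variable names (`colMedians`, `medOfMedians`, `trueMedian`, and so on) are illustrative, and `M(:)` is used to stack all entries of M into a single column. It also checks the mean, which does agree here, but only because every column has the same number of rows. The random trial at the end prints how often the two medians differ and makes no claim about the count.

```matlab
% Sketch: median of column medians vs. median of all entries.
% Assumption: M is the 3-by-3 counterexample from the answer.
M = [1 2 30; 4 5 60; 10 20 30];

colMedians   = median(M);            % 1-by-n row vector of column medians: [4 5 30]
medOfMedians = median(colMedians);   % 5
trueMedian   = median(M(:));         % M(:) flattens M into one column; result is 10

% The mean behaves differently: with equal-sized columns,
% the mean of the column means equals the overall mean.
meanOfMeans = mean(mean(M));         % 18
trueMean    = mean(M(:));            % 18

fprintf('median of medians = %g, true median = %g\n', medOfMedians, trueMedian);
fprintf('mean of means     = %g, true mean   = %g\n', meanOfMeans, trueMean);

% Illustrative check on random integer matrices (hypothetical trial setup):
% count how often the two medians differ.
trials = 1000;
mismatches = 0;
for k = 1:trials
    R = randi(100, 3, 3);            % random 3-by-3 matrix, entries in 1..100
    if median(median(R)) ~= median(R(:))
        mismatches = mismatches + 1;
    end
end
fprintf('%d of %d random 3x3 matrices disagree\n', mismatches, trials);
```

If the columns had different lengths (for example, subsets stored in a cell array), the mean of means would generally no longer equal the overall mean either.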
    