Solved – When does the sum of the medians = the median of the sum

median, random variable

I have two random variables (say x1 and x2) defined by empirical probability distributions, and would like to calculate the median of their sum.

Under what circumstances (in terms of the distributions of x1 and x2) can I assume the median of the sum is equal to the sum of the medians i.e.

median(x1) + median(x2). (1)

The alternative approach I've used is to randomly generate large samples of x1 and x2 and then calculate the median as

median(sample of x1 + sample of x2). (2)

Approach (1) is quicker and I need to do this calculation many times. Under what circumstances is approach (1) approximately correct? Are there alternatives to my second approach?

I've seen this Q&A: "What does it mean if the median or average of sums is greater than sum of those of addends?"

Additional information after reading the comments:

If we have two independent, normally distributed random variables, then the sample median of the sum is approximately the sum of the sample medians:

N1 <- rnorm(10000, mean = 1, sd = 0.1)
N2 <- rnorm(10000, mean = 0)

# We expect an answer of 1 and get close

median(N1) + median(N2) #[1] 0.9918688
median(N1 + N2) #[1] 0.9962555

This doesn't work for exponential variables:

set.seed(2002)
e1 <- rexp(100000, 1)
e2 <- rexp(100000, 1)

median(e1) + median(e2) # expect 2* log(2) = 1.386 and get 1.374
median(e1 + e2) # expect 1.678 and get 1.668
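The two "expect" values in the comments above can be checked without simulation. This is a minimal sketch that assumes the two exponential variables are independent, so that their sum follows a Gamma distribution with shape 2 and rate 1; the variable names are only illustrative.

```r
# Sum of the medians of two Exp(1) variables: 2 * log(2)
sum_of_medians <- 2 * qexp(0.5, rate = 1)

# The sum of two independent Exp(1) variables is Gamma(shape = 2, rate = 1),
# so the median of the sum is the 0.5 quantile of that distribution.
median_of_sum <- qgamma(0.5, shape = 2, rate = 1)

sum_of_medians  # about 1.386
median_of_sum   # about 1.678
```

The gap between the two numbers comes from the distributions themselves, not from sampling noise.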

So, looking at @glen_b's comment, is symmetry a sufficient condition for assuming that the median of the sum is the sum of the medians?
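One way to probe this numerically is to repeat the experiment with symmetric distributions that are not normal. The sketch below assumes independence and uses a uniform and a shifted Student t variable purely as illustrative choices; neither appears in the original post. If X1 and X2 are independent and each is symmetric about its own median, their sum is symmetric about the sum of those medians, so the two results should agree up to sampling error.

```r
set.seed(123)
n <- 100000

# Two independent, symmetric, non-normal variables (illustrative choices)
s1 <- runif(n, min = 0, max = 2)  # symmetric about 1
s2 <- 5 + rt(n, df = 3)           # symmetric about 5, heavy tails

# Both quantities should be close to 1 + 5 = 6
median(s1) + median(s2)
median(s1 + s2)
```

Without independence this argument does not apply directly, because the sum of two dependent symmetric variables need not be symmetric.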

Best Answer

Actually, my comment was not entirely correct; allow me to clarify.

The median of a series of numbers $X$ is calculated by ordering the numbers from smallest to largest and then taking the number in the middle. When you add $Y$ to $X$ element by element, the ordering generally changes, and so does the element that ends up in the middle. Therefore, in general, you should assume that

$$ \text{MED}(X + Y) \neq \text{MED}(X) + \text{MED}(Y) $$

However, there is at least one exception: whenever $Y$ is ordered the same way as $X$, adding $Y$ to $X$ does not change the ordering, so the middle element of $X + Y$ is the sum of the middle elements of $X$ and $Y$. For instance, this happens when $Y$ is an exact copy of $X$; see this example (written in R):

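set.seed(42)
n <- 100
x <- rnorm(n)
x_copy <- x    # exact copy of x, so it has the same ordering as x
y <- rnorm(n)  # independent draw, ordered differently from x

# Independent y: the two results differ
median(x+y)                # 0.0767433
median(x) + median(y)      # 0.02050838

# Identical copy: the two results agree
median(x + x_copy)         # 0.1795935
median(x) + median(x_copy) # 0.1795935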
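As for alternatives to drawing large random samples: if the two variables can be treated as independent and each empirical distribution is a set of equally weighted observations, the distribution of the sum is given by all pairwise sums, and its median can be computed directly. The sketch below rests on those assumptions; `median_of_independent_sum` is a hypothetical helper name, and the tiny skewed vectors are made up for illustration.

```r
# Hypothetical helper (not from the original post): exact median of X1 + X2
# when X1 and X2 are independent draws from two equally weighted samples.
median_of_independent_sum <- function(x1, x2) {
  # outer() forms all length(x1) * length(x2) pairwise sums
  median(as.vector(outer(x1, x2, "+")))
}

# Made-up, strongly skewed example data
x1 <- c(0, 1, 10)
x2 <- c(0, 2, 20)

median_of_independent_sum(x1, x2)  # 10
median(x1) + median(x2)            # 3
```

The number of pairwise sums grows as the product of the two sample sizes, so for large samples it may be necessary to thin or bin the data first.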