I am currently working with a box plot (shown below) that consists of two boxes per value of one of the independent variables (call it $x$). The other independent variable is indicated by the two boxes (call it $y$). The blue box represents the dependent variable (call it $z$) under the condition $y = 1$, while the red box represents $z$ under the condition $y = 2$. The bottom and top whiskers represent the 10th and 90th percentiles respectively, while the bottom and top edges of a box represent the 25th and 75th percentiles respectively. The median is marked in black.

My hypothesis is that, when $y = 1$ (blue box), the empirical CDF of $z$ is to the right of the empirical CDF of $z$ when $y = 2$ (red box) "in general" (on average or otherwise) for every value of $x$. This relationship (although not particularly strong) can be seen in the plot below. However, I am not sure how to phrase this precisely in terms of a statistical test.

One possibility that I thought of was to use a two-sample Kolmogorov-Smirnov test for each value of $x$, but I am not sure how helpful this would be. Another possibility is that, because the data were generated in pairs (i.e., one specific value of $z$ when $y = 1$ can be matched to another specific value of $z$ when $y = 2$), I should subtract the value of $z$ when $y = 2$ from the corresponding value of $z$ when $y = 1$, and then check that the differences are always (or mostly) positive. Any suggestions would be appreciated.

## Best Answer

Maybe you're interested in whether sample `y` stochastically dominates sample `x`. If so, you might want to look directly at ECDF plots, and do some formal tests.

Here are summaries and ECDF plots of two samples.

Because the ECDF of `y` (brown) plots to the right of the ECDF of `x` (blue), and therefore below it, it seems the values of `y` are generally larger than values of `x`.

A two-sample Kolmogorov-Smirnov test confirms this with a P-value below 5%. The test statistic $D$ is the maximum vertical distance between the two ECDF plots.
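To make the definition of $D$ concrete, here is a minimal Python sketch that computes it directly from two ECDFs. The helper names (`ecdf`, `ks_statistic`, `one_sided_gaps`) and the toy numbers are assumptions made for illustration; they are not the samples discussed in this answer, and the sketch does not compute a P-value.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return F(t) = fraction of sample values <= t (hypothetical helper)."""
    s = sorted(sample)
    n = len(s)
    return lambda t: bisect_right(s, t) / n

def ks_statistic(x, y):
    """Two-sample KS statistic D: largest vertical gap between the two ECDFs."""
    fx, fy = ecdf(x), ecdf(y)
    # Both ECDFs are step functions that jump only at observed values,
    # so it is enough to compare them at the pooled data points.
    return max(abs(fx(t) - fy(t)) for t in set(x) | set(y))

def one_sided_gaps(x, y):
    """Largest amount by which F_x exceeds F_y, and vice versa."""
    fx, fy = ecdf(x), ecdf(y)
    points = set(x) | set(y)
    x_above = max(fx(t) - fy(t) for t in points)
    y_above = max(fy(t) - fx(t) for t in points)
    return x_above, y_above

# Toy data (made up): y is x shifted to the right by 2.
x = [1, 2, 3, 4, 5]
y = [3, 4, 5, 6, 7]
print(ks_statistic(x, y))
print(one_sided_gaps(x, y))
```

If the second one-sided gap is zero, the ECDF of `y` never rises above the ECDF of `x`, which is the pattern you would expect when `y` tends to be larger. In practice a library routine such as `scipy.stats.ks_2samp` reports the statistic together with a P-value.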

When two samples are not of the same shape (including the same variability), a two-sample Wilcoxon rank-sum test is said to be a test of stochastic dominance (rather than of different medians).
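One way to see why the rank-sum test speaks to stochastic dominance is that its statistic (in the equivalent Mann-Whitney form) counts how often a value from one sample exceeds a value from the other. The sketch below is an illustration under assumptions: the function names and toy numbers are hypothetical, ties are counted as one half, and no P-value is computed.

```python
def mann_whitney_u(x, y):
    """Count pairs (x_i, y_j) with y_j > x_i; each tie contributes 1/2."""
    u = 0.0
    for yj in y:
        for xi in x:
            if yj > xi:
                u += 1.0
            elif yj == xi:
                u += 0.5
    return u

def prob_superiority(x, y):
    """Estimate P(Y > X) + 0.5 * P(Y = X) as U divided by the number of pairs."""
    return mann_whitney_u(x, y) / (len(x) * len(y))

# Toy data (made up): y is x shifted to the right by 2.
x = [1, 2, 3, 4, 5]
y = [3, 4, 5, 6, 7]
print(mann_whitney_u(x, y))
print(prob_superiority(x, y))
```

An estimate well above 0.5 suggests that values of `y` tend to exceed values of `x`. For a formal test with a P-value, a library routine such as `scipy.stats.mannwhitneyu` can be used instead of this hand-rolled count.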

Notes:

(1) Technically speaking, there are several different types of 'stochastic dominance' with somewhat different definitions. You may be interested in googling that. Perhaps start here.

(2) The fictitious samples used in the above discussion were sampled in R as follows: