I have not found any examples of bootstrap hypothesis tests for the median of differences, so I'd like to suggest my own approach. Question: Do you agree that the reproducible example below is a correct way of testing the null hypothesis that the median of differences is 0 (against the alternative hypothesis that it is larger than 0)?
In addition, I am trying to relate this to the paper Two Guidelines for Bootstrap Hypothesis Testing. That paper differs from my approach in that, instead of computing p-values, it finds critical t-values corresponding to certain significance levels. Nevertheless, my approach seems to fulfill the first guideline: resample from $\hat{\theta}^*-\hat{\theta}$ (because I transform the differences d to d - median(d) before drawing bootstrap samples). However, I don't understand how to incorporate the second guideline: base the test on the bootstrap distribution of $(\hat{\theta}^*-\hat{\theta}) / \hat{\sigma}^*$. I'd be glad about any hints.
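One way the second guideline could be incorporated is a double (nested) bootstrap: for each outer resample, an inner bootstrap estimates the standard error $\hat{\sigma}^*$ of the median, and the test is based on the studentized statistic $(\hat{\theta}^*-\hat{\theta}) / \hat{\sigma}^*$. The sketch below is my own illustration, not taken from the paper; the constants R and B are arbitrary choices, and note that the bootstrap distribution of a sample median is discrete, so studentizing can behave erratically for medians.

```r
# Studentized (percentile-t) bootstrap test for H0: median(d) = 0,
# a sketch using a nested bootstrap to estimate sigma* (assumed setup,
# constants R and B are illustrative)
set.seed(1)
x1 <- rnorm(100) + 0.05
x2 <- rnorm(100)
d  <- x1 - x2
N  <- length(d)
t_hat <- median(d)

R <- 1000   # outer bootstrap resamples
B <- 100    # inner resamples per outer resample, used to estimate sigma*
z_star <- numeric(R)
for (i in 1:R) {
  d_star <- sample(d, size = N, replace = TRUE)   # outer resample
  m_star <- median(d_star)
  inner  <- replicate(B, median(sample(d_star, size = N, replace = TRUE)))
  z_star[i] <- (m_star - t_hat) / sd(inner)       # studentized statistic
}

# Observed statistic under H0: median(d) = 0, studentized with a
# bootstrap SE computed from the original sample
se_hat <- sd(replicate(B, median(sample(d, size = N, replace = TRUE))))
z_obs  <- (t_hat - 0) / se_hat
p <- (sum(z_star >= z_obs) + 1) / (R + 1)
p
```

Because resampling is done from the original d and the statistic is centered at median(d), the null is imposed through the centering rather than by shifting the data, which is how the studentized version satisfies the first guideline automatically.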
Hypotheses
H0: median(d) = 0
H1: median(d) > 0,
where d = x1 – x2 and the values are assumed to be paired. For illustration, the data sample might look as follows, where for each id, the corresponding values of x1 and x2 represent a pair.
id x1 x2 d
1 -0.58 -0.62 0.04
2 0.23 0.04 0.19
3 -0.79 -0.91 0.12
4 1.65 0.16 1.49
5 0.38 -0.65 1.03
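For concreteness, the illustrative table above can be reproduced as an R data frame (a minimal sketch; the name dat is my own):

```r
# Illustrative paired sample from the table above
dat <- data.frame(
  id = 1:5,
  x1 = c(-0.58, 0.23, -0.79, 1.65, 0.38),
  x2 = c(-0.62, 0.04, -0.91, 0.16, -0.65)
)
dat$d <- dat$x1 - dat$x2   # paired differences
```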
Explanation of Approach
Transformation: To sample under H0, I first transform the values of d by subtracting their median. This ensures that for the transformed values d_H0 = d - median(d), H0: median(d) = 0 is true.
Bootstrap sampling: Then I draw R bootstrap samples: I sample from d_H0 with replacement and compute the median of each sample, obtaining R medians of differences.
Computing p-value: The p-value is computed as the proportion of cases in which the R bootstrap medians exceed median(d), the median of the differences in the one given data sample. A normalization constant is added (hence the +1 in the numerator and the denominator).
Reproducible Example (in R)
# -------------------------------------------------
# Function to get bootstrapped statistics t_star
# -------------------------------------------------
my_boot = function(d_H0, R){
  N = length(d_H0)
  t_star = numeric(R)
  for (i in 1:R){
    t_star[i] = median(sample(d_H0, size = N, replace = TRUE))
  }
  return(t_star)
}
# -------------------------------------------------
# Generate sample
# -------------------------------------------------
set.seed(1)
x1 = rnorm(100) + 0.05
x2 = rnorm(100)
d = x1 - x2
t = median(d)
# -------------------------------------------------
# Adjust sample to fulfill H0: median(d) = 0
# -------------------------------------------------
d_H0 = d - t
# -------------------------------------------------
# Conduct bootstrap sampling
# -------------------------------------------------
R = 5000
t_star = my_boot(d_H0, R)
# -------------------------------------------------
# Compute p-value
# -------------------------------------------------
p = (sum(t_star > t) + 1) / (R + 1)
p # 0.03
Best Answer
Bonett & Price (2002) proposed a closed-form solution for this, so no bootstrapping is required. At the very least, you could compare it to your bootstrap approach to see how often the two methods agree.
Forgive me for the verbose code; I added a lot of comments and went step by step for the sake of readability.
Reference
Bonett, D. G., & Price, R. M. (2002). Statistical inference for a linear function of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological Methods, 7(3), 370–383. doi:10.1037/1082-989x.7.3.370