Solved – Help with understanding the assumptions of Wilcoxon signed rank test

assumptions, time series, wilcoxon-signed-rank

I am comparing monthly actual losses with forecast losses over the same 12-month period to see whether our forecasts are in line with actuals. I am a bit confused about the assumptions of the Wilcoxon signed rank test and whether I can apply it for this purpose. Our data are not normally distributed and are correlated. Can anyone please help me understand the assumptions below in simple language so I can see if this test can be used for my data?
I am not sure I understand assumption #1 about "independently drawn". In some places the assumption of the Wilcoxon signed rank test is stated as "difference of paired samples should be independent". Some books state that "the Sample should be paired and dependent". Some articles say that "the Paired differences should be Symmetrical". What are the actual assumptions, and which of these statements are true? I also don't understand the assumptions about independence and dependence. What do they actually mean?

Assumptions:

  1. Independence – The Wilcoxon sign test assumes independence, meaning that the paired observations are randomly and independently drawn.

  2. Dependent samples – the two samples need to be dependent observations of the cases. The Wilcoxon sign test assesses differences between a before and after measurement, while accounting for individual differences in the baseline.

  3. Continuous dependent variable – Although the Wilcoxon signed rank test ranks the differences according to their size and is therefore a non-parametric test, it assumes that the measurements are continuous in theoretical nature. To account for the fact that in most cases the dependent variable is binomially distributed, a continuity correction is applied.

Best Answer

  1. The expectation is that there will be dependence within pairs $(x_i,y_i)$, but this is not actually a requirement -- the test will work correctly whether this is true or not. The test is applied to the pair-differences $d_i =y_i-x_i$; if there's positive dependence, taking account of this pairing by taking differences is helpful in reducing variation.

  2. The differences are assumed to be independent of each other: $d_i$ is independent of $d_j$ for $i \neq j$. This is unlikely to be true of time series.
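
As a rough illustration of the two points above, here is a minimal sketch that forms the pair differences $d_i = y_i - x_i$ and computes their lag-1 sample autocorrelation as a crude check for serial dependence. It is not part of the original answer: the function names and the monthly figures are made up for illustration, $x$ is taken to be the actual loss and $y$ the forecast, and with only 12 differences the estimate is very noisy, so treat it as a hint rather than a test.

```python
def pair_differences(actual, forecast):
    """Return d_i = y_i - x_i, with x = actual and y = forecast (an assumed labelling)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return [y - x for x, y in zip(actual, forecast)]


def lag1_autocorrelation(diffs):
    """Lag-1 sample autocorrelation of the differences.

    Values far from 0 suggest that neighbouring differences are not independent.
    """
    n = len(diffs)
    mean = sum(diffs) / n
    denominator = sum((d - mean) ** 2 for d in diffs)
    if denominator == 0:
        return 0.0  # all differences identical; no variation to correlate
    numerator = sum((diffs[i] - mean) * (diffs[i + 1] - mean) for i in range(n - 1))
    return numerator / denominator


# Hypothetical 12 months of losses (made-up numbers, for illustration only).
actual = [102, 98, 110, 120, 115, 130, 128, 140, 135, 150, 149, 160]
forecast = [100, 101, 105, 112, 118, 121, 125, 131, 136, 141, 145, 152]

diffs = pair_differences(actual, forecast)
print(diffs)
print(round(lag1_autocorrelation(diffs), 3))
```

If the autocorrelation is clearly away from zero, the independence assumption on the $d_i$ is doubtful and the usual null distribution of the statistic should not be trusted.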

> Continuous dependent variable – Although the Wilcoxon signed rank test ranks the differences according to their size and is therefore a non-parametric test, it assumes that the measurements are continuous

If they're not, the tabled distribution doesn't apply and the test will depend on the pattern of ties.

> To account for the fact that in most cases the dependent variable is binomially distributed, a continuity correction is applied.

This makes no sense to me. How would a continuity correction deal with the problem? In large samples you could retain a normal approximation but use a variance that takes account of the pattern of ties, and in smaller samples you'd attempt to compute or simulate from the permutation distribution.
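
To make both suggestions concrete, here is a minimal sketch (mine, not the original answerer's code) that handles ties by using midranks and then enumerates every sign assignment to get the exact permutation distribution of $W^+$, the sum of the ranks attached to positive differences. The function names are hypothetical. Two conventions are assumptions on my part: zero differences are dropped before ranking, and the two-sided p-value counts sign patterns at least as far from the null mean as the observed statistic. Full enumeration is only practical for small $n$ (for 12 differences it is $2^{12} = 4096$ patterns). The helper `tie_aware_null_variance` gives the variance one could plug into a normal approximation for larger samples; it equals $\sum r_i^2 / 4$ because each rank enters $W^+$ independently with probability $1/2$ under the null.

```python
from itertools import product


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def tie_aware_null_variance(ranks):
    """Variance of W+ under random sign flips, valid with or without ties."""
    return sum(r * r for r in ranks) / 4


def signed_rank_exact_test(diffs):
    """Return (W+, two-sided exact p-value) by enumerating all sign patterns."""
    nonzero = [d for d in diffs if d != 0]  # assumption: zeros are discarded
    ranks = midranks([abs(d) for d in nonzero])
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    null_mean = sum(ranks) / 2
    observed_gap = abs(w_plus - null_mean)
    as_extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - null_mean) >= observed_gap - 1e-12:  # tolerance for float noise
            as_extreme += 1
        total += 1
    return w_plus, as_extreme / total


example_diffs = [1, 1, -1, 2]  # made-up differences containing a three-way tie
w_plus, p_value = signed_rank_exact_test(example_diffs)
print(w_plus, p_value)
```

Because the ranks are computed from the data at hand, the enumerated distribution automatically reflects whatever pattern of ties is present, which is exactly what the tabled distribution cannot do.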

See also the discussion here

> Some articles say that "the Paired differences should be Symmetrical".

The signed rank test is a permutation test on the signed ranks (the ranks of the absolute differences, each carrying the sign of its difference). Viewed that way, for the signs to be exchangeable under the null (in the sense that every rank is as likely to have come from a positive as from a negative difference), the distribution of the differences would seem to need to be symmetric.

(If you don't have symmetry, then it's not generally the case that under the null you could legitimately reallocate the signs like that - for a given rank one sign would typically be more likely than the other.)
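
A small simulation can illustrate the point. This is only a sketch under my own assumptions, not something from the original answer: the symmetric case uses standard normal differences, the skewed case uses an exponential shifted by $\ln 2$ so that its median is zero, and the helper name `positive_share_among_large` is made up. Both distributions have median zero, but in the skewed one the negative differences cannot be smaller than $-\ln 2$, so every difference with a large absolute value (and hence a high rank) is positive.

```python
import math
import random


def positive_share_among_large(diffs, threshold):
    """Share of positive values among differences with |d| above the threshold."""
    large = [d for d in diffs if abs(d) > threshold]
    return sum(d > 0 for d in large) / len(large)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
threshold = math.log(2)

# Symmetric about 0: a large |d| is about equally likely to be positive or negative.
symmetric = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Right-skewed with median 0: Exp(1) has median ln 2, so subtracting ln 2 centres it.
skewed = [rng.expovariate(1.0) - math.log(2) for _ in range(n)]

symmetric_share = positive_share_among_large(symmetric, threshold)
skewed_share = positive_share_among_large(skewed, threshold)
print(round(symmetric_share, 3), round(skewed_share, 3))
```

In the symmetric case the share is close to one half, so flipping signs at random reproduces the null. In the skewed case the large ranks always carry a plus sign even though the median is zero, so reallocating signs at random would not describe the data under that null.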
