Solved – Differences in sample size when calculating weighted correlations (Pearson, Kendall-tau) in SPSS

Tags: correlation, kendall-tau, pearson-r, spss, weighted-data

Please consider the following dataset in SPSS:

v1  v2  weight
1   3   0.50
2   2   0.50
3   4   1.00
4   5   1.00
5   3   1.00
1   3   2.00
2   2   2.00
3   .   3.00
.   5   1.00
5   4   0.50

When I calculate unweighted correlations between v1 and v2 in SPSS (pairwise deletion of missing values), I get the following results:

Pearson's r: n=8, r=0.525
Kendall's tau: n=8, tau=0.334

Now suppose I apply the case weights given in the "weight" column. I then get the following results:

Pearson's r: n=8.5 (due to weighting), r=0.496
Kendall's tau: n=10 (weighted), tau=0.274

My problem is that I don't understand why the two correlations report different weighted sample sizes.
I understand SPSS weights as case weights, so I expected a weighted sample size of 8.5 for both correlations, which is indeed what I get for Pearson's r.
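
For reference, the Pearson figures can be checked by hand outside SPSS. The following is a minimal Python sketch, assuming the weights act as frequency-style case weights in the means and cross-products and that the reported n is simply the sum of the weights over the complete pairs. The helper name `weighted_pearson` is made up for this illustration and is not an SPSS function.

```python
import math

# (v1, v2, weight) rows from the question; None marks a missing value.
rows = [
    (1, 3, 0.5), (2, 2, 0.5), (3, 4, 1.0), (4, 5, 1.0), (5, 3, 1.0),
    (1, 3, 2.0), (2, 2, 2.0), (3, None, 3.0), (None, 5, 1.0), (5, 4, 0.5),
]

# Pairwise deletion: keep only rows where both v1 and v2 are present.
complete = [(x, y, w) for x, y, w in rows if x is not None and y is not None]


def weighted_pearson(data):
    """Return (sum of weights, weighted Pearson r) for (x, y, w) triples."""
    n = sum(w for _, _, w in data)
    mean_x = sum(w * x for x, _, w in data) / n
    mean_y = sum(w * y for _, y, w in data) / n
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in data)
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in data)
    syy = sum(w * (y - mean_y) ** 2 for _, y, w in data)
    return n, sxy / math.sqrt(sxx * syy)


# Unweighted case: every complete pair counts once.
n_u, r_u = weighted_pearson([(x, y, 1.0) for x, y, _ in complete])
# Weighted case: use the weights exactly as given, fractions included.
n_w, r_w = weighted_pearson(complete)

print(n_u, round(r_u, 3))  # 8.0 0.525
print(n_w, round(r_w, 3))  # 8.5 0.496
```

Both lines agree with the SPSS output for Pearson's r, so the fractional weights appear to be used as they are there.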

Can anyone explain why the sample size is different for Kendall's tau? If there is a paper or website on this topic, I'm happy to read it myself, but I haven't found anything useful so far.

[Edit] Could it be that SPSS rounds the case weights to whole numbers for Kendall's tau? In that case we would indeed end up with n=10. But that raises the question of why SPSS does so.
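
The rounding idea can be probed numerically outside SPSS. The sketch below is in Python and rests on two assumptions that are not confirmed by the SPSS output itself: that a weight of 0.5 rounds up to 1 (round half up), and that each case is then replicated that many times before an ordinary Kendall's tau-b is computed. The functions `round_half_up` and `kendall_tau_b` are hand-written helpers for this illustration, not SPSS or library calls.

```python
import math

# Complete (v1, v2, weight) cases from the question after pairwise deletion.
complete = [
    (1, 3, 0.5), (2, 2, 0.5), (3, 4, 1.0), (4, 5, 1.0),
    (5, 3, 1.0), (1, 3, 2.0), (2, 2, 2.0), (5, 4, 0.5),
]


def round_half_up(w):
    """Round to the nearest integer with .5 going up (assumed rule)."""
    # Python's built-in round() sends 0.5 to 0, so it is not used here.
    return math.floor(w + 0.5)


def kendall_tau_b(pairs):
    """Plain O(n^2) Kendall's tau-b for a list of (x, y) pairs."""
    conc = disc = tie_x = tie_y = 0
    n = len(pairs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = pairs[i][0] - pairs[j][0]
            dy = pairs[i][1] - pairs[j][1]
            if dx == 0:
                tie_x += 1
            if dy == 0:
                tie_y += 1
            if dx * dy > 0:
                conc += 1
            elif dx * dy < 0:
                disc += 1
    n0 = n * (n - 1) // 2  # total number of pairs
    return (conc - disc) / math.sqrt((n0 - tie_x) * (n0 - tie_y))


# Replicate each case according to its rounded weight.
expanded = []
for x, y, w in complete:
    expanded.extend([(x, y)] * round_half_up(w))

print(sum(w for _, _, w in complete))        # 8.5 (raw weight sum)
print(len(expanded))                         # 10  (rounded weight sum)
print(round(kendall_tau_b(expanded), 3))     # 0.274
print(round(kendall_tau_b([(x, y) for x, y, _ in complete]), 3))  # 0.334
```

Under these assumptions the replicated data give n=10 and tau=0.274, matching the weighted SPSS output, while the unreplicated data give the unweighted 0.334. That is consistent with the rounding hypothesis, though it does not show what SPSS does internally or why.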

[Edit2] As suggested by Joel W., here's the SPSS syntax to reproduce what I'm doing.

Code for unweighted correlations:

CORRELATIONS
  /VARIABLES=v1 v2
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
NONPAR CORR
  /VARIABLES=v1 v2
  /PRINT=KENDALL TWOTAIL NOSIG
  /MISSING=PAIRWISE.

Code for weighted correlations:

WEIGHT BY weight.
CORRELATIONS
  /VARIABLES=v1 v2
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
NONPAR CORR
  /VARIABLES=v1 v2
  /PRINT=KENDALL TWOTAIL NOSIG
  /MISSING=PAIRWISE.

Thx,
deschen2

Best Answer