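There are two ways to do this.
First, note that 'pdist' with the 'correlation' distance returns one minus the correlation between each pair of rows, as a vector ordered by the pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4). These are the off-diagonal entries of 1-corr(x'):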
>> x
x =
1 2 3 4
2 3 2 3
1 2 3 4
4 3 2 1
>> pdist(x,'cor')
ans =
0.5528 0 2.0000 0.5528 1.4472 2.0000
>> 1-corr(x')
ans =
0 0.5528 0 2.0000
0.5528 0 0.5528 1.4472
0 0.5528 0 2.0000
2.0000 1.4472 2.0000 0
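If you want to check that correspondence numerically rather than by eye, something like the following sketch should work. It assumes the Statistics and Machine Learning Toolbox is available; the variable names S, R and maxDiff are only for illustration.

```matlab
x = [1 2 3 4; 2 3 2 3; 1 2 3 4; 4 3 2 1];

D = pdist(x,'correlation');   % condensed vector of pairwise distances
S = squareform(D);            % expand it to a full symmetric 4-by-4 matrix
R = 1 - corr(x');             % one minus the correlations among rows of x

% Largest elementwise difference between the two matrices
maxDiff = max(abs(S(:) - R(:)));
```

maxDiff should be zero up to floating-point rounding.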
1) The first way is to compute the distance as one minus the absolute correlation and run 'linkage' on that. For comparison, the plain correlation distance is clustered first, then converted:
>> D = pdist(x,'cor');
>> linkage(D,'single')
ans =
1.0000 3.0000 0
2.0000 5.0000 0.5528
4.0000 6.0000 1.4472
>> C = 1-D
C =
0.4472 1.0000 -1.0000 0.4472 -0.4472 -1.0000
>> D = 1-abs(C)
D =
0.5528 0 0 0.5528 0.5528 0
>> linkage(D,'single')
ans =
3.0000 4.0000 0
1.0000 5.0000 0
2.0000 6.0000 0.5528
Notice that points 1, 3 and 4 are now merged at zero distance, even though point 4 has a correlation of -1 with points 1 and 3.
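To turn that tree into explicit cluster labels, one option is to cut it with 'cluster'. This is only a sketch: the cutoff height of 0.1 is an arbitrary value chosen for this toy data, and the names Dabs and T are mine.

```matlab
x = [1 2 3 4; 2 3 2 3; 1 2 3 4; 4 3 2 1];

C    = 1 - pdist(x,'correlation');   % recover the pairwise correlations
Dabs = 1 - abs(C);                   % distance that ignores the sign
Z    = linkage(Dabs,'single');

% Cut the tree at height 0.1 (arbitrary threshold for this example)
T = cluster(Z,'cutoff',0.1,'criterion','distance');
```

With this cutoff, rows 1, 3 and 4 should share one label and row 2 should get its own.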
2) The second way is to pass the distance function directly to the 'linkage' command as a function handle:
>> linkage(x,'single',@(xrow,ymat) 1-abs(corr(xrow',ymat')))
ans =
3.0000 4.0000 0
1.0000 5.0000 0
2.0000 6.0000 0.5528
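If you want to convince yourself that the two ways agree, you can compare the merge heights. The sketch below passes the handle to 'pdist' first and then calls 'linkage' on the result; the name absCorrDist is made up for illustration, and the extra transpose makes the handle return a column vector, which is the shape 'pdist' documents for custom distance functions.

```matlab
x = [1 2 3 4; 2 3 2 3; 1 2 3 4; 4 3 2 1];

% Way 1: transform the built-in correlation distance
D1 = 1 - abs(1 - pdist(x,'correlation'));
Z1 = linkage(D1,'single');

% Way 2: custom distance handle (xi is one row, xj is a matrix of rows)
absCorrDist = @(xi,xj) 1 - abs(corr(xi',xj'))';
D2 = pdist(x,absCorrDist);
Z2 = linkage(D2,'single');

% Compare merge heights only; the merge order can differ when distances tie
heightDiff = max(abs(Z1(:,3) - Z2(:,3)));
```

heightDiff should be zero up to rounding.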