MATLAB: How to avoid out of memory errors or very long computation times when using LINKAGE

Tags: linkage, pdist, Statistics and Machine Learning Toolbox, ward

When I try to execute the following code:
load cluster
W=pdist(Z, 'euclidean');
Y=linkage(W, 'ward');
I get a segmentation fault.

Best Answer

When LINKAGE is called with only a precomputed set of distances and 'ward', it tries to determine whether those distances are Euclidean, because if they are not, the output of LINKAGE is meaningless. That check creates several large intermediate matrices, which is what causes the out-of-memory errors and, in some cases, segmentation violations.
If Ward's linkage is a requirement, the right way to do it is to use the three-input-argument form of LINKAGE, which takes the raw data and the name of the metric instead of the output of PDIST:
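To get a feel for why this runs out of memory, here is a rough back-of-the-envelope sketch. The value of n is hypothetical (the size of Z is not given in the question), and treating each intermediate as a full n-by-n double matrix is an assumption about what the check allocates, not something stated above.

```matlab
% Hypothetical number of observations (rows of Z); the real size is not given.
n = 50000;
bytesPerDouble = 8;

% pdist returns one distance per unordered pair of rows: n*(n-1)/2 values.
pdistBytes = bytesPerDouble * n * (n - 1) / 2;

% One full square n-by-n double matrix. Assumption: the intermediates built
% by the Euclidean check are of this shape.
squareBytes = bytesPerDouble * n^2;

fprintf('pdist vector:      %.1f GB\n', pdistBytes / 2^30);
fprintf('one n-by-n matrix: %.1f GB\n', squareBytes / 2^30);
```

Under these assumptions, each square intermediate is roughly twice the size of the distance vector you already hold, and several of them are created at once.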
Y = linkage(Z,'ward','euclidean');
Your script can therefore be changed to:
load cluster.mat;
Y = linkage(Z,'ward','euclidean');
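If you want to convince yourself that the two call forms describe the same clustering, you can compare them on a data set small enough for both to run. This is only a sketch: X is made-up random data rather than your Z, and the merge heights are expected to agree only up to floating-point rounding, since the two forms may take different code paths internally.

```matlab
rng(0);              % fixed seed so the run is repeatable
X = rand(200, 3);    % hypothetical small data set, not the Z from cluster.mat

% Two-argument form from the question: precomputed distances plus 'ward'.
Y1 = linkage(pdist(X, 'euclidean'), 'ward');

% Three-argument form from the answer: raw data, method, metric.
Y2 = linkage(X, 'ward', 'euclidean');

% Column 3 holds the merge heights. Sort before comparing so the check does
% not depend on the order in which the rows are reported.
maxDiff = max(abs(sort(Y1(:,3)) - sort(Y2(:,3))));
fprintf('largest difference in merge heights: %g\n', maxDiff);
```

A difference close to machine precision suggests the two forms agree on this data, so switching to the three-argument form should not change your results.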