MATLAB: How to produce a collapsed dendrogram plot while retaining cluster membership in the Statistics Toolbox 7.0 (R2008a)

dendrogram, Statistics and Machine Learning Toolbox

I would like an option to collapse a dendrogram below a specified score. The available method in MATLAB (which limits the number of branches) is not the same, because it collapses the branches without printing the names of the targets.
For example, when I execute the following code,
X = rand(30,2);
Y = pdist(X,'cityblock');
Z = linkage(Y,'average');
[H1,T1] = dendrogram(Z, 0, 'colorthreshold','default');
the dendrogram is very cluttered. I would like to eliminate this clutter by collapsing smaller clusters. However, when I use the following syntax to call dendrogram, I lose cluster membership information:
[H1,T1] = dendrogram(Z, 4, 'colorthreshold','default');

Best Answer

The ability to create a collapsed dendrogram plot with cluster membership is not available with the DENDROGRAM function in Statistics Toolbox 7.0 (R2008a). However, you can create such a dendrogram by manually collapsing the smaller clusters as demonstrated in the following example.
Save the attached function 'dendrogram_collapsed.m' to your MATLAB path and then call it using the following code. This example collapses the dendrogram to create 4 clusters.
P = 4;
[H,T] = dendrogram_collapsed(Z, P, 'colorthreshold','default');
The function may also be called with an additional threshold parameter. However, the threshold cannot be used in place of P above. If the threshold is not chosen appropriately, the resulting cluster information may be incorrect.
[H,T] = dendrogram_collapsed(Z, 4, 'colorthreshold','default', 'threshold', .5);
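One way to sanity-check a threshold before passing it alongside P is to count how many clusters a given height cutoff produces with the standard CLUSTER function. The following is only a sketch: the cutoff value is illustrative, and it is an assumption that the attached 'dendrogram_collapsed.m' interprets its threshold as a linkage distance in the same way.

```matlab
% Sketch: count the clusters produced by a distance cutoff.
% Uses only standard toolbox functions (pdist, linkage, cluster).
X = rand(30,2);
Y = pdist(X,'cityblock');
Z = linkage(Y,'average');

cutoff = 0.5;  % illustrative value, not taken from the attached function

% Cut the tree wherever the linkage distance exceeds the cutoff.
Tc = cluster(Z, 'cutoff', cutoff, 'criterion', 'distance');

% Number of clusters implied by this cutoff.
numClusters = max(Tc);
fprintf('A cutoff of %g gives %d clusters.\n', cutoff, numClusters);
```

If numClusters differs from the P you intend to pass, the threshold and P describe different cuts of the tree. Note that the cluster numbers returned by CLUSTER are not guaranteed to match the leaf numbers returned by DENDROGRAM.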
Alternatively, you could modify the plot from the second call to dendrogram to include the class membership as a custom data tip on the plot. The attached function 'customdendrogram.m' is another wrapper around dendrogram that allows you to specify a threshold instead of the number of leaf nodes. It then creates a custom data tip so that clicking on a leaf displays a pop-up with all the labels that fall under that cluster. You can run this function using the following example code:
threshold = .3;
[H,T] = customdendrogram(Z, threshold);
Click on the little circles at the leaves of the tree to see the labels.
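If the attached files are not at hand, the membership of each collapsed leaf can also be recovered from the second output of the standard DENDROGRAM call, which holds the leaf node number assigned to each original observation. The sketch below is not the attached code; the variable names 'members' and 'labels' are illustrative, and relabeling the axis ticks this way is only one possible presentation.

```matlab
% Sketch: list the original observations under each collapsed leaf.
X = rand(30,2);
Y = pdist(X,'cityblock');
Z = linkage(Y,'average');

P = 4;  % number of leaf nodes to keep
[H,T,perm] = dendrogram(Z, P, 'colorthreshold','default');

% T(i) is the leaf node that observation i was collapsed into.
members = cell(P,1);
for k = 1:P
    members{k} = find(T == k)';
    fprintf('Leaf %d: %s\n', k, mat2str(members{k}));
end

% Optional: show the members as tick labels. perm gives the leaf
% order along the axis, from left to right.
labels = cell(1,P);
for i = 1:P
    labels{i} = mat2str(members{perm(i)});
end
set(gca, 'XTickLabel', labels);
```

With many observations per leaf the tick labels become long, so printing the lists to the Command Window may be the more readable option.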