MATLAB: Compare values of a matrix


I have an n*n matrix and want to group its row/column indices into classes. I will explain with a 5*5 example. The matrix is symmetric, so (1,2) = (2,1), and the diagonal is always 1; it is a correlation matrix of pairs. I have a threshold of 0.9: whenever a value is greater than 0.9, the two indices of that pair are clustered in the same group.
I start with row 1 and select the values greater than 0.9, which here is only the 4th entry, so (1,4) becomes group A. In the same iteration I also find the smallest value in the row, which should be less than 0.9. Here it is the 3rd entry, 0.75, so I move to row 3 for the second iteration.
I then repeat the process, but only over the columns that are not yet grouped, which are 2, 3 and 5. In row 3 only the 5th entry is greater than 0.9, so (3,5) is the next pair. The smallest value in row 3 is (3,1), but since columns 1 and 4 are already in group A, I only look for the minimum among the remaining columns 2 and 5, which is column 2. So I move to row 2.
At this point 2 is the only item left, so I look for the maximum value in row 2, which is (2,5), and add 2 to the group containing 5, giving (2,3,5). I repeat this process until all items are placed in groups A, B, C, and so on. In other words, the last remaining item is simply grouped with the item it has the best value with.
There should not be any case where an item ends up in two groups. For example, after grouping (1,4), when I process row 3 the 4th column will most likely not be greater than 0.9. The factors come from image features that are associated with each other, so if 1 is poorly associated with 3 and well associated with 4, then 3 will also be poorly associated with 4.
Please let me know if you have any questions, as I realize this may be confusing. Can you suggest a simple algorithm that does this? Thanks in advance.
For the matrix below, my final groups should be (1,4) and (2,3,5):
1.00 0.88 0.75 0.91 0.79
0.88 1.00 0.76 0.74 0.97
0.75 0.76 1.00 0.76 0.99
0.91 0.74 0.76 1.00 0.80
0.79 0.97 0.99 0.80 1.00
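As a quick check of the first iteration described above, here is a minimal sketch in MATLAB. The variable names (groupA, nextRow) are only illustrative, and the diagonal is entered as 1 as stated in the description:

```matlab
% Correlation matrix from the question (symmetric, ones on the diagonal).
m = [1.00 0.88 0.75 0.91 0.79
     0.88 1.00 0.76 0.74 0.97
     0.75 0.76 1.00 0.76 0.99
     0.91 0.74 0.76 1.00 0.80
     0.79 0.97 0.99 0.80 1.00];
threshold = 0.9;

row = 1;                               % the first iteration starts on row 1
groupA = find(m(row, :) > threshold);  % columns above the threshold, including the diagonal
[~, nextRow] = min(m(row, :));         % least-correlated column seeds the next iteration
```

With this data, groupA holds columns 1 and 4, and nextRow is 3, matching the walk-through.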

Best Answer

I'm still not entirely clear on your whole algorithm, but I think the following does what you want:
function clusters = cluster(m, threshold)
    % Keep track of the column indices remaining; clustered columns are
    % removed from m as they are clustered.
    columns = 1:size(m, 2);
    clusters = {};
    currentrow = 1;  % start on 1st row
    while ~isempty(m)
        % Cluster columns whose value is above threshold on the current row.
        % The diagonal entry of the current row is 1, so its own column is always included.
        tocluster = m(currentrow, :) > threshold;
        if sum(~tocluster) == 1
            % Only one column would be left afterward: include it in the current cluster.
            tocluster = true(size(tocluster));
        end
        clusters{end + 1} = columns(tocluster); %#ok<AGROW> % actual indices of clustered columns
        [~, newrow] = min(m(currentrow, :));    % find next row to start clustering from
        currentrow = columns(newrow);
        m = m(:, ~tocluster);                   % get rid of columns that have been clustered
        columns = columns(~tocluster);
    end
end
I certainly didn't understand what to do when there's only one column/row left to cluster. In the code above, it is included in the last cluster.
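For comparison, here is a hedged sketch of the other reading of the question, where the single leftover item joins whichever group contains its best-correlated partner. It is written as a script rather than a function so it can be run as is. The names remaining, pick, last and best are my own, and it assumes threshold < 1 so that the diagonal entry always qualifies:

```matlab
% Example matrix from the question.
m = [1.00 0.88 0.75 0.91 0.79
     0.88 1.00 0.76 0.74 0.97
     0.75 0.76 1.00 0.76 0.99
     0.91 0.74 0.76 1.00 0.80
     0.79 0.97 0.99 0.80 1.00];
threshold = 0.9;   % assumed to be below 1

n = size(m, 1);
remaining = 1:n;   % indices not yet grouped
clusters = {};
row = 1;           % start on the first row
while ~isempty(remaining)
    vals = m(row, remaining);
    pick = vals > threshold;              % includes row itself (diagonal is 1)
    clusters{end + 1} = remaining(pick);  %#ok<AGROW>
    [~, k] = min(vals);                   % least-correlated remaining column
    row = remaining(k);                   % next row to process
    remaining = remaining(~pick);
    if numel(remaining) == 1
        % Assumption: the lone leftover joins the group of its best partner.
        last = remaining;
        others = setdiff(1:n, last);
        [~, b] = max(m(last, others));
        best = others(b);
        for c = 1:numel(clusters)
            if any(clusters{c} == best)
                clusters{c} = sort([clusters{c}, last]);
            end
        end
        remaining = [];
    end
end
```

On the example matrix both approaches give the same groups, (1,4) and (2,3,5). They can differ when the leftover item's best partner is not in the most recent cluster.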