Hello,
The title might be unclear, but I didn't find a better one.
Let's have an array A of two columns filled with coordinate values Ax and Ay. Those coordinates define points in a scatterplot. I use kmeans with two clusters, which gives me the coordinates of the two centroids, Acx and Acy. Now let's have an array B with, similarly, two columns of values Bx and By.
I want to find how to separate the points in B into two sets so that the mean of each set is as close as possible to the centroids Acx and Acy. I started thinking about a way using the minimum distance, but it seems that the result can be wrong… The result may be a single set if the points are too far from the other centroid. Since I need to repeat the process a huge number of times, I'd like it to be as fast as possible.
Any help or clue is welcome. KeFop
[EDIT] Here is some code to illustrate my point:
close all
clear
% points with obvious two clusters
Apts = rand(10,2);
Bpts = rand(10,2);
Bpts(:,1) = Bpts(:,1)+2;
Total = cat(1, Apts, Bpts);
% adding one point at the edge of the two clusters
Sp = mean(Total)*1.2;
Total = cat(1, Total, Sp);
% let's add the centroids, imported from a previous classification, nearby the means
% of the clusters
[~, ClustMean] = kmeans([Total(:,1), Total(:,2)], 2, 'MaxIter', 100, 'Display', 'off', 'Replicates', 3);
centroids = ClustMean;
centroids(:,1) = centroids(:,1)+0.2;
% Define the best set of points, with means as close as possible to the imported
% centroids
EuclDist = zeros(size(Total,1), 2);
ClustSelect = zeros(1, size(Total,1));
for t = 1:size(Total,1)
    for c = 1:2
        EuclDist(t,c) = sqrt((Total(t,1)-centroids(c,1)).^2 + (Total(t,2)-centroids(c,2)).^2);
    end
    [~, ClustSelect(t)] = min(EuclDist(t,:));
end
figure
hold all
% points with colors depending on cluster
scatter(Total(ClustSelect==1,1), Total(ClustSelect==1,2), 'r');
scatter(Total(ClustSelect==2,1), Total(ClustSelect==2,2), 'b');
% mean of the two clusters
scatter(ClustMean(1,1), ClustMean(1,2), 60, 'r', 'filled');
scatter(ClustMean(2,1), ClustMean(2,2), 60, 'b', 'filled');
% centroids
scatter(centroids(1,1), centroids(1,2), 'g', 'd', 'LineWidth', 10);
scatter(centroids(2,1), centroids(2,2), 'g', 'd', 'LineWidth', 10);
legend('Points of cluster A', 'Points of cluster B', 'Mean of cluster A', 'Mean of cluster B', 'Imported centroid', 'Imported centroid', 'Location', 'northeastoutside');
% Now manually switch the point in the middle from one cluster to the
% other
ClustSelect2 = ClustSelect;
if ClustSelect2(end) == 1
    ClustSelect2(end) = 2;
elseif ClustSelect2(end) == 2
    ClustSelect2(end) = 1;
end
% recalculate the new means (dimension 1 is explicit so a one-point set still gives a 1x2 mean)
MeanClustA = mean(Total(ClustSelect2==1,:), 1);
MeanClustB = mean(Total(ClustSelect2==2,:), 1);
figure
hold all
% points with colors depending on cluster
scatter(Total(ClustSelect2==1,1), Total(ClustSelect2==1,2), 'r');
scatter(Total(ClustSelect2==2,1), Total(ClustSelect2==2,2), 'b');
% mean of the two modified clusters
scatter(MeanClustA(1,1), MeanClustA(1,2), 60, 'r', 'filled');
scatter(MeanClustB(1,1), MeanClustB(1,2), 60, 'b', 'filled');
% centroids
scatter(centroids(1,1), centroids(1,2), 'g', 'd', 'LineWidth', 10);
scatter(centroids(2,1), centroids(2,2), 'g', 'd', 'LineWidth', 10);
legend('Points of cluster A', 'Points of cluster B', 'Mean of cluster A', 'Mean of cluster B', 'Imported centroid', 'Imported centroid', 'Location', 'northeastoutside');
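The two figures only compare the labellings visually. As a rough sketch, the comparison could also be put into numbers by summing, over both sets, the distance between the set mean and its imported centroid. Everything below is an assumption made for illustration: the fixed points in pts, the centroid values, and the choice of a plain sum of Euclidean distances as the mismatch measure (the post does not say which measure is meant). It avoids kmeans so that it runs without any toolbox.

```matlab
% Made-up data: two tight groups of four points plus one point in between.
pts = [0 0; 0 1; 1 0; 1 1; 3 0; 3 1; 4 0; 4 1; 2.4 0.6];
% Made-up "imported" centroids, one row per cluster.
centroids = [0.9 0.5; 3.5 0.5];

% Nearest-centroid labelling: distance of every point to every centroid.
d = zeros(size(pts,1), size(centroids,1));
for c = 1:size(centroids,1)
    d(:,c) = sqrt((pts(:,1)-centroids(c,1)).^2 + (pts(:,2)-centroids(c,2)).^2);
end
[~, labels] = min(d, [], 2);

% Alternative labelling: move the last (in-between) point to the other set.
labelsSwitched = labels;
labelsSwitched(end) = 3 - labels(end);   % swaps label 1 and label 2

% Mismatch of each labelling: sum over both sets of the distance between
% the set mean and its centroid. An empty set would give NaN here.
candidates = {labels, labelsSwitched};
cost = zeros(1, numel(candidates));
for k = 1:numel(candidates)
    for c = 1:size(centroids,1)
        members = pts(candidates{k} == c, :);
        cost(k) = cost(k) + norm(mean(members, 1) - centroids(c,:));
    end
end
fprintf('nearest-centroid: %.4f, switched: %.4f\n', cost(1), cost(2));
```

With this made-up data the in-between point is closest to the second centroid, yet the switched labelling gives the smaller mismatch, which is the situation described in the question. The same loop could score any candidate labelling, but it does not search for the best one.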
Best Answer