MATLAB: Euclidean distance-based clustering with predetermined number of members

Tags: classification, clustering, MATLAB, Statistics and Machine Learning Toolbox

Hello to you all

I have a data set of points in 2D coordinates, and I want to cluster these points into K groups based on the minimum distance between them. Each cluster should have a predetermined number of members, for example five, as in the following picture. Note that any remaining data points will be left unclustered.

Is there any function in MATLAB that can help me?

Best Answer

You could easily write your own loop to do it. Just start with the closest pair of points and keep assigning nearby neighbors to that cluster until you reach the required number of points in the cluster. Then increment the cluster number and repeat to find the next cluster. Keep going until all points are used up.

clc;    % Clear the command window.
fprintf('Beginning to run %s.m.\n', mfilename);
close all;  % Close all figures (except those of imtool.)
clear;  % Erase all existing variables. Or clearvars if you want.
workspace;  % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 22;
numPoints = 200;
pointsPerCluster = 5;
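% Use floor so that leftover points (fewer than pointsPerCluster) stay unclustered.
numClusters = floor(numPoints / pointsPerCluster);
numClustered = numClusters * pointsPerCluster; % Number of points that end up in a cluster.
t = table(zeros(numClustered, 1), zeros(numClustered, 1), zeros(numClustered, 1), zeros(numClustered, 1), 'VariableNames', {'ClusterNumber', 'PointNumber', 'x', 'y'});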
xy = rand(numPoints, 2);
x = xy(:, 1);
y = xy(:, 2);
subplot(1, 2, 1);
plot(x, y, 'b.', 'MarkerSize', 14);
grid on;
% Get distances of every point to every other point.
distances = pdist2(xy, xy);
% Find the closest pair of points. One of them seeds the first cluster.
minDistance = min(distances(distances ~= 0));
[row, ~] = find(distances == minDistance);
currentRow = row(1); % Get first point.

pointer = 1;
for k = 1 : numClusters
	% Get the distance of every point to the current seed point.
	distances = pdist2(xy, xy(currentRow, :));
	% Find the row indices of the closest points (the seed itself is at distance 0).
	% Taking the indices directly from mink guarantees exactly pointsPerCluster rows, even with tied distances.
	[~, rows] = mink(distances, pointsPerCluster);
	% Store these coordinates as cluster #k
	for n = 1 : length(rows)
		t.ClusterNumber(pointer) = k;
		t.PointNumber(pointer) = n;
		t.x(pointer) = xy(rows(n), 1); % Store x value.
		t.y(pointer) = xy(rows(n), 2); % Store y value.
		pointer = pointer + 1;
	end
	if pointer >= numPoints
		break; % Quit when all points are used up
	end
	% Set the current row coordinates to infinity so we know not to consider (use) them again.
	xy(rows, :) = inf;
	% Get new cluster -- new starting point.
	% Get distances of every point to every other point
	distances = pdist2(xy, xy);
	minDistance = min(distances(distances~=0));
	[row, col] = find(distances == minDistance);
	currentRow = row(1); % Get first point.
end
% Show clusters in unique colors
subplot(1, 2, 2);
gscatter(t.x, t.y, t.ClusterNumber);

Be aware that as points in closely located clusters get used up, the points available for the remaining clusters will be more spread out. I think that's obvious though, right? For example, if there are only 2 clusters in 1-D and your values are [1,3,5, 61,62,63,64,65, 99,100], then cluster #1 will be [61,62,63,64,65] (tightly grouped) and cluster #2 will be [1,3,5, 99,100] (widely spaced).
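The 1-D example above can be checked with a short sketch. This is not the answer's original code, only a minimal illustration of the same greedy idea under a few assumptions: the names `remaining` and `clusters` are made up for this example, it uses `abs` instead of `pdist2` so no toolbox is required, and it relies on implicit expansion (R2016b or later) and `mink` (R2017b or later).

```matlab
% Sketch only: greedy fixed-size clustering of the 1-D values from the example.
values = [1, 3, 5, 61, 62, 63, 64, 65, 99, 100]';
pointsPerCluster = 5;
numClusters = floor(numel(values) / pointsPerCluster);
remaining = true(size(values));      % Points not yet assigned (hypothetical helper).
clusters = cell(numClusters, 1);     % One row vector of values per cluster.
for k = 1 : numClusters
	idx = find(remaining);           % Indices of points still available.
	v = values(idx);
	d = abs(v - v');                 % Pairwise distances via implicit expansion.
	d(logical(eye(numel(v)))) = inf; % Ignore each point's distance to itself.
	[~, linearIndex] = min(d(:));    % Closest remaining pair.
	[seed, ~] = ind2sub(size(d), linearIndex);
	% The seed plus its nearest neighbors form cluster k.
	[~, nearest] = mink(abs(v - v(seed)), pointsPerCluster);
	clusters{k} = sort(values(idx(nearest)))';
	remaining(idx(nearest)) = false; % Mark these points as used.
end
celldisp(clusters);
```

With these values the first cluster comes out as [61 62 63 64 65] and the second as [1 3 5 99 100], which shows how the later clusters inherit whatever is left over.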

Related Solutions

MATLAB: I need x,y coordinates which are randomly generated between 1 to 400 for x coordinates and 1 to 100 for y coordinates, condition is minimum distance between coordinates is 10 and maximum distance is 20, distances should vary from 10 to 20? plz help

Change

if min(distances > 8)

to

mind = min(distances);
if mind >= 10 && mind <= 20

MATLAB: Finding Minimum Distance between two points

You don't need pdist2() because you aren't asking for the distance of every point to every other point. You're only asking for the distance from every point to the single point at (0, 0). So you can simply use sqrt()! Try this:

% Create 10 random points.
xy = rand(10, 2);
x = xy(:, 1);
y = xy(:, 2);
plot(x, y, 'bo', 'MarkerSize', 10);
grid on;
% Compute the distance of each of those points from (0, 0)
distances = sqrt(xy(: , 1) .^ 2 + xy(:, 2) .^ 2)
% Find the closest one.
[minDistance, indexOfMin] = min(distances);
closestX = x(indexOfMin);
closestY = y(indexOfMin);
% Mark it with red *
hold on; % Don't blow away existing points.
plot(closestX, closestY, 'r*', 'MarkerSize', 8, 'LineWidth', 2);
% Draw a line from the closest point to (0, 0)
line([0, closestX], [0, closestY], 'LineWidth', 2, 'Color', 'r');
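The same approach carries over to a reference point other than the origin. The following is a minimal sketch, assuming a made-up reference point `refPoint` that is not part of the original question. It uses the built-in `hypot`, which computes sqrt(a.^2 + b.^2) while guarding against intermediate overflow and underflow.

```matlab
% Sketch only: refPoint is a hypothetical reference point chosen for illustration.
xy = rand(10, 2);
refPoint = [0.5, 0.5];
% Distance from every point to refPoint.
distances = hypot(xy(:, 1) - refPoint(1), xy(:, 2) - refPoint(2));
% Find the closest one.
[minDistance, indexOfMin] = min(distances);
closestPoint = xy(indexOfMin, :);
fprintf('Closest point is (%.3f, %.3f) at distance %.3f.\n', ...
	closestPoint(1), closestPoint(2), minDistance);
```

For a single reference point this stays a plain vectorized expression, so pdist2() is still not needed.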
