This is equivalent to a modified version of k-means, where some of the centroids are constrained to take given values. Below, I'll describe how to formulate this as an optimization problem, then how to solve it.
Formulating the problem
Let $X = \{x_1, \dots, x_n\}, x_i \in \mathbb{R}^d$ be a set of data points to cluster and let $\{c_1, \dots, c_k\}, c_i \in \mathbb{R}^d$ denote a set of $k$ centroids. Suppose the first $k' < k$ centroids are already known (e.g. they've been learned using an initial round of k-means clustering). $X$ may or may not include data used to learn this initial clustering.
The goal is to learn the remaining $k-k'$ centroids such that, when each point is assigned to the nearest centroid, the sum of squared point-to-centroid distances is minimized:
$$\min_{c_{k'+1}, \dots, c_k} \ \sum_{j=1}^k \sum_{x \in S_j} \| x - c_j\|^2$$
where $S_j$ is a set containing the points assigned to centroid $j$ (i.e. points for which $c_j$ is the nearest centroid using Euclidean distance). Note that the above is equivalent to the standard k-means problem, with the first $k'$ centroids constrained to take known values.
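For concreteness, here's a minimal numpy sketch of evaluating this objective for a given set of centroids; the function name and array conventions are my own, not from any particular library:

```python
import numpy as np

def kmeans_objective(X, centroids):
    """Sum of squared distances from each point to its nearest centroid.

    X:         (n, d) array of data points
    centroids: (k, d) array of all k centroids (fixed ones first)
    """
    # (n, k) matrix of squared Euclidean distances between points and centroids
    sq_dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Each point contributes its squared distance to the nearest centroid
    return sq_dists.min(axis=1).sum()
```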
Solving the problem
The standard (unconstrained) k-means problem is typically solved using Lloyd's algorithm:
1. Initialize the cluster centroids. For example, choose them to be a random subset of the data points. More clever strategies exist as well (e.g. k-means++).
2. Assign each point to the nearest centroid (using Euclidean distance).
3. Update each centroid to be the mean of all points assigned to it.
4. Repeat steps 2 and 3 until the assignment of points to centroids no longer changes.
A simple modification of Lloyd's algorithm can be used to solve the modified k-means problem above: in steps 1 and 3, keep the first $k'$ centroids fixed at their given values, and only initialize/update centroids $k'+1$ through $k$.
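Here's a rough sketch of what that modification might look like in numpy. The function and its argument names are hypothetical, and it omits refinements such as k-means++ initialization:

```python
import numpy as np

def constrained_kmeans(X, fixed_centroids, n_free, n_iter=100, rng=None):
    """Lloyd's algorithm with the first k' centroids held fixed.

    X:               (n, d) data matrix
    fixed_centroids: (k', d) centroids kept at their given values
    n_free:          number of additional centroids to learn (k - k')
    """
    rng = np.random.default_rng(rng)
    # Step 1: initialize only the free centroids, e.g. as a random subset of points
    free = X[rng.choice(len(X), size=n_free, replace=False)]
    centroids = np.vstack([fixed_centroids, free]).astype(float)
    k_fixed = len(fixed_centroids)

    assign = None
    for _ in range(n_iter):
        # Step 2: assign each point to the nearest centroid (Euclidean distance)
        sq_dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assign = sq_dists.argmin(axis=1)
        if assign is not None and np.array_equal(assign, new_assign):
            break  # assignments stopped changing -> converged
        assign = new_assign
        # Step 3: update only the free centroids; the first k' stay fixed
        for j in range(k_fixed, len(centroids)):
            members = X[assign == j]
            if len(members) > 0:  # leave an empty cluster's centroid where it is
                centroids[j] = members.mean(axis=0)
    return centroids, assign
```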
Notes
The solution to the modified k-means problem above will have error at least as large as that of standard k-means. That is, constraining the first $k'$ centroids can only reduce performance (or, at best, leave it unchanged) compared to re-solving for all $k$ centroids. Whether this is acceptable depends on the underlying problem you're trying to solve.
This is a non-standard problem. It's unlikely that a software implementation already exists, but it should be straightforward to implement it yourself.
Lloyd's algorithm may converge to a local minimum, so it gives only approximate solutions to the k-means problem; finding an exact solution is NP-hard. A common strategy is to run the algorithm multiple times from different initial centroids and keep the best result, as in the sketch below.
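For example, a restart wrapper built on the hypothetical sketches above might look like:

```python
def constrained_kmeans_restarts(X, fixed_centroids, n_free, n_restarts=10):
    # Keep the run with the lowest sum of squared point-to-centroid distances
    best_obj, best = np.inf, None
    for seed in range(n_restarts):
        centroids, assign = constrained_kmeans(X, fixed_centroids, n_free, rng=seed)
        obj = kmeans_objective(X, centroids)
        if obj < best_obj:
            best_obj, best = obj, (centroids, assign)
    return best
```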
Various strategies for accelerating Lloyd's algorithm (e.g. Elkan's algorithm, which uses the triangle inequality to reduce the number of distance computations) should be applicable in the constrained case as well.
Best Answer
When you did K-means, presumably you treated the attributes at each pixel as a $5$-tuple of real values and clustered them based on Euclidean distance in $\mathbb{R}^5$. To achieve spatial contiguity in the clustering, include spatial coordinates among the attributes. If you include (say) the two Cartesian map coordinates, you will effectively be doing the K-means clustering in $\mathbb{R}^7 \approx \mathbb{R}^5 \times \mathbb{R}^2$. I have written this as a Cartesian product to emphasize that there is a tuning parameter available to you: the relative scale of the two spatial attributes compared to the first five. By rescaling the spatial attributes you can vary the amount by which they influence the clustering: with only a little influence, the result will be noisy; with a lot of influence, you will be performing purely spatial clustering. Experiment to identify an optimal value.
(This is an approach I have used many times. It is not always as successful as I would like, but it works sufficiently often to be worth looking at in any case.)
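As a concrete (if simplified) illustration of this approach, here is a sketch using scikit-learn's KMeans. The `spatial_weight` parameter and the `(H, W, 5)` image layout are assumptions for the example:

```python
import numpy as np
from sklearn.cluster import KMeans

def spatially_weighted_kmeans(img, n_clusters, spatial_weight):
    """Cluster pixels on (attributes + rescaled x/y coordinates).

    img:            (H, W, 5) array of per-pixel attributes
    spatial_weight: larger values push toward purely spatial clustering
    """
    H, W, _ = img.shape
    yy, xx = np.mgrid[0:H, 0:W]
    coords = np.stack([yy, xx], axis=-1).astype(float)
    # Normalize coordinates to [0, 1] so the weight is independent of image size
    coords /= max(H - 1, W - 1)
    # Cluster in R^7 = R^5 (attributes) x R^2 (rescaled spatial coordinates)
    features = np.concatenate([img.reshape(H * W, 5),
                               spatial_weight * coords.reshape(H * W, 2)], axis=1)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    return labels.reshape(H, W)
```

Sweeping `spatial_weight` from small to large values traces out the spectrum described above, from noisy attribute-only clustering to purely spatial clustering.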