Solved – Image Clustering with K-means – Postprocessing

clustering, image processing, k-means, python, unsupervised learning

I clustered the pixels of an image with K-means (each pixel is an observation with 5 variables associated with it). The results are pretty detailed, but I think they are a little noisy. Does anyone have a good idea for reducing the noise, or know of some postprocessing for K-means? I would usually just apply a median filter to the image or something of the sort, but I want to know if there is something nicer out there. Thank you in advance. I am not sure whether this was the right place to post this question; let me know.

P.S. This was all done in Python, by the way, and only brightly colored pixels were clustered.

Best Answer

When you did K-means, presumably you treated the attributes at each pixel as a $5$-tuple of real values and you clustered them based on Euclidean distance in $\mathbb{R}^5$. To achieve spatial contiguity in the clustering, include spatial coordinates among the attributes. If you include (say) the two Cartesian map coordinates, you will effectively be doing the K-means clustering in $\mathbb{R}^7 \approx \mathbb{R}^5 \times \mathbb{R}^2$. I have written this as a Cartesian product to emphasize that there is a tuning parameter available to you: the relative sizes of the last two (spatial) attributes compared to the first five attributes. By rescaling the spatial attributes you can vary the amount by which they influence the clustering. With only a little influence, the result will be noisy; with a lot of influence, you will be performing purely spatial clustering. Experiment to identify an optimal value.
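
As a rough illustration of the idea above: if the spatial attributes are multiplied by a scale factor $\lambda$ (my notation, not the answer's), the squared distance between two pixels with attribute vectors $a, a'$ and coordinates $s, s'$ becomes $\|a - a'\|^2 + \lambda^2 \|s - s'\|^2$. The sketch below is one possible way to set this up in Python with NumPy only. The names `add_spatial_features`, `kmeans` and `spatial_weight` are hypothetical, the standardisation step is an assumption of this sketch, and the image is synthetic data rather than the data from the question.

```python
import numpy as np


def add_spatial_features(features, spatial_weight):
    """Append scaled (row, col) coordinates to the per-pixel attributes.

    features: array of shape (H, W, C), e.g. C = 5 as in the question.
    spatial_weight: the tuning parameter (hypothetical name). 0 ignores
        position; large values make the clustering essentially spatial.
    Returns an array of shape (H * W, C + 2).
    """
    h, w, c = features.shape
    rows, cols = np.mgrid[0:h, 0:w]
    xy = np.stack([rows, cols], axis=-1).reshape(-1, 2).astype(float)
    flat = features.reshape(-1, c).astype(float)
    # Assumption of this sketch: standardise every column so that
    # spatial_weight has a comparable meaning across images.
    flat = (flat - flat.mean(axis=0)) / (flat.std(axis=0) + 1e-12)
    xy = (xy - xy.mean(axis=0)) / (xy.std(axis=0) + 1e-12)
    return np.hstack([flat, spatial_weight * xy])


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with Euclidean distance; returns labels."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct observations.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distances, shape (N, k). Fine for small images only.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centre unchanged
                centers[j] = members.mean(axis=0)
    # Final assignment against the last set of centres.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Synthetic 32 x 32 image with 5 attributes per pixel: pure noise on the
# left half, noise plus an offset on the right half.
rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32, 5))
image[:, 16:, :] += 2.0

X = add_spatial_features(image, spatial_weight=0.5)
labels = kmeans(X, k=2).reshape(image.shape[:2])
```

Rerunning with several values of `spatial_weight` and comparing the label maps is the experiment the answer describes. Any K-means implementation can stand in for the small `kmeans` function here, for example `sklearn.cluster.KMeans`. If only some pixels are to be clustered, as in the question, the rows of `X` can be selected with a boolean mask before clustering.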

(This is an approach I have used many times. It is not always as successful as I would like, but it works sufficiently often to be worth looking at in any case.)