Solved – Object Localisation without Classification

image processing, machine learning

I have a data set of photos, each containing one object. I want to find the coordinates of the rectangle enclosing the object.

Note that each photo contains exactly 1 object (for example, if there is a pair of shoes in the photo it is to be treated as one object), and the photos are taken against a simple white background. However, the images are not limited to one class of objects; the object can be anything.

I have a training set consisting of photos and, for each photo, the coordinates of the rectangle enclosing the object. Given a new photo (exactly 1 object, taken against a simple white background), I want to find the coordinates of the enclosing rectangle.

I searched a lot for a method to do so, and found resources for achieving localization with classification, but neither do I want to classify the objects nor do I have class labels in my training set.

I also thought edge detection and object segmentation methods could be useful.

However, I feel that my task is much simpler since I know that I have to localize only 1 object in an image and the background is also simple, so there must be some simple methods I am overlooking.

Any guidance is much appreciated, and I am relatively new to machine learning so I would be grateful for guidance to implement the appropriate technique.

Best Answer

If the photos are taken against a simple white background and the object's appearance is clearly distinguishable from the background, you do not really need anything as heavy as a deep-learning-based method.
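To illustrate how light such a method can be, here is a minimal sketch that treats every pixel darker than a cutoff as part of the object and takes the extent of those pixels as the rectangle. The function name `bounding_box`, the cutoff of 240, and the representation of the image as a 2D list of grayscale values are all assumptions made for the example; in practice you would load the photo with an imaging library and probably tune the cutoff on your training rectangles.

```python
def bounding_box(gray, white_threshold=240):
    """Return (left, top, right, bottom), inclusive, of all non-white pixels.

    gray is a 2D list (rows of grayscale values in 0..255). A pixel counts as
    "object" if it is darker than white_threshold (an assumed cutoff).
    Returns None if the image is empty or contains no such pixel.
    """
    if not gray or not gray[0]:
        return None
    width = len(gray[0])
    # Rows and columns that contain at least one non-white pixel.
    rows = [y for y, row in enumerate(gray)
            if any(v < white_threshold for v in row)]
    cols = [x for x in range(width)
            if any(row[x] < white_threshold for row in gray)]
    if not rows or not cols:
        return None
    return (cols[0], rows[0], cols[-1], rows[-1])


# Toy 5x6 "photo": 255 is white background, lower values are the object.
image = [
    [255, 255, 255, 255, 255, 255],
    [255, 255,  40,  60, 255, 255],
    [255,  30,  20,  50,  70, 255],
    [255, 255,  45, 255, 255, 255],
    [255, 255, 255, 255, 255, 255],
]
print(bounding_box(image))  # (1, 1, 4, 3)
```

A plain threshold like this is sensitive to shadows and sensor noise, so it is only a starting point; blurring the image first or ignoring isolated dark pixels are common ways to make it more robust.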

The task touches several areas of computer vision, for example foreground/background segmentation using Markov Random Fields, Conditional Random Fields, or graph cuts (GraphCut).
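The graph-based methods named above are too involved for a short example, so the sketch below swaps in a much simpler form of foreground/background segmentation: it flood-fills the near-white region connected to the image border and calls everything else foreground. This is not an MRF, CRF, or GraphCut implementation. The function name `foreground_mask` and the cutoff of 240 are assumptions made for the example. Unlike a plain threshold, it keeps white areas that are enclosed by the object (such as a white logo on a shoe) as part of the object.

```python
from collections import deque


def foreground_mask(gray, white_threshold=240):
    """Return a 2D list of booleans, True where a pixel is foreground.

    Background is defined as near-white pixels (>= white_threshold, an
    assumed cutoff) that are 4-connected to the image border.
    """
    height, width = len(gray), len(gray[0])
    background = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the fill with every near-white pixel on the image border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and gray[y][x] >= white_threshold:
                background[y][x] = True
                queue.append((y, x))

    # Breadth-first flood fill through near-white neighbours.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < height and 0 <= nx < width
                    and not background[ny][nx]
                    and gray[ny][nx] >= white_threshold):
                background[ny][nx] = True
                queue.append((ny, nx))

    return [[not is_bg for is_bg in row] for row in background]


# A dark ring with a white hole in the middle.
ring = [
    [255, 255, 255, 255, 255],
    [255,  10,  10,  10, 255],
    [255,  10, 255,  10, 255],
    [255,  10,  10,  10, 255],
    [255, 255, 255, 255, 255],
]
mask = foreground_mask(ring)
print(mask[2][2])  # True: the enclosed white hole counts as foreground
```

The enclosing rectangle then follows from the smallest and largest row and column indices at which the mask is True.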

If you insist on using a deep learning method, looking into saliency detection might be helpful. It is a widely studied area with both traditional and deep learning approaches.