Solved – Image classification where images are of different dimensions, resolutions, etc

classification, image processing

I am working on an image classification problem where I have black and white images of all different dimensions and resolutions. The images belong to 1 of 5 groups including an unknown category. These images are stored sparsely, by which I mean as a list of coordinates.

For example the first three observations for an image might look like this:

X Y
0 0
0 1
2 3
: :

where there are points at (0,0), (0,1), and (2,3). Since the images are black and white, these coordinates correspond to the pixels in the image that are black.
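To make the storage format concrete, here is a minimal sketch of turning such a coordinate list into a dense binary array. It assumes X is the column index and Y is the row index, and that the image extent can be taken from the largest coordinate present; neither is stated above, and `coords_to_image` is just an illustrative name.

```python
import numpy as np

def coords_to_image(coords):
    """Turn a list of (x, y) black-pixel coordinates into a dense binary array.

    Assumptions: x is the column index, y is the row index, and the image
    extent is inferred from the largest coordinate seen, because the true
    canvas size is not stored in the coordinate list.
    """
    coords = np.asarray(coords, dtype=int)
    width = coords[:, 0].max() + 1
    height = coords[:, 1].max() + 1
    image = np.zeros((height, width), dtype=np.uint8)
    image[coords[:, 1], coords[:, 0]] = 1  # 1 marks a black pixel
    return image

# The three points from the example above.
example = coords_to_image([(0, 0), (0, 1), (2, 3)])
```

If the real canvas size is available from the metadata, it would be better to pass it in than to infer it from the points.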

My goal is to classify these images into 1 of 5 groups. I have some metadata that I also plan to use, such as the date the image was created, but ultimately I need to find meaningful features from the images themselves. My plan was to run PCA and feed the resulting components into an SVM or random forest model.

I am somewhat familiar with the classic handwritten digits example; however, I am not familiar with cases where the images are not of a standard dimension or resolution.

One idea that I have had is to divide each image into a fixed number of cells, say a 100×100 grid. This way each image would be described by 10,000 values (one per cell), which I could then run PCA on. My concern with this idea is that it reduces the information available for classifying the images.
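A rough sketch of the grid idea follows, working directly on the coordinate list. It assumes each image is rescaled to its own bounding box and that each cell stores the fraction of black pixels landing in it; both are choices made for illustration, and `grid_features` is a hypothetical helper name.

```python
import numpy as np

def grid_features(coords, grid=100):
    """Fraction of an image's black pixels falling in each cell of a grid x grid lattice.

    Coordinates are rescaled to the image's own bounding box first, so images
    of any size or resolution map to a vector of length grid * grid.
    """
    pts = np.asarray(coords, dtype=float)
    mins = pts.min(axis=0)
    span = pts.max(axis=0) - mins
    span[span == 0] = 1.0  # avoid dividing by zero for a single row or column
    unit = (pts - mins) / span  # every coordinate now lies in [0, 1]
    cells = np.minimum((unit * grid).astype(int), grid - 1)
    counts = np.zeros((grid, grid))
    np.add.at(counts, (cells[:, 1], cells[:, 0]), 1)  # row index = y, column index = x
    return (counts / len(pts)).ravel()

features = grid_features([(0, 0), (0, 1), (2, 3)], grid=100)
```

Stretching each axis independently discards the aspect ratio, which is one concrete form of the information loss mentioned above; it could be kept as an extra feature alongside the grid values.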

The Question

Much of the work on image processing that I have seen uses examples where the images are of comparable size, resolution, etc. How do we develop models for image classification when the images are not standardized in this way?

Best Answer

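I would try LBP (local binary patterns), or another descriptor chosen according to what your images and classes actually look like. To deal with the different sizes, you can use a Bag of Words encoding, which turns a variable number of local descriptors into a fixed-length histogram.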
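As a rough illustration of how these two pieces could fit together, the sketch below computes basic 8-neighbour LBP codes, describes each non-overlapping patch by a histogram of its codes, clusters those patch histograms into a small codebook with plain k-means, and then encodes every image as a histogram over codebook entries. The patch size, codebook size, helper names, and the random placeholder images are all assumptions made for the example, not part of the answer; images smaller than one patch would need separate handling.

```python
import numpy as np

def lbp_codes(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D array."""
    img = np.asarray(image, dtype=float)
    center = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : img.shape[0] - 1 + dy, 1 + dx : img.shape[1] - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    return codes

def patch_descriptors(image, patch=8):
    """One 256-bin LBP histogram per non-overlapping patch; the count depends on image size."""
    codes = lbp_codes(image)
    rows, cols = codes.shape
    descriptors = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            hist = np.bincount(codes[r : r + patch, c : c + patch].ravel(), minlength=256)
            descriptors.append(hist / hist.sum())
    return np.array(descriptors)

def nearest(descriptors, centres):
    """Index of the closest centre (squared Euclidean distance) for each descriptor."""
    dists = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def build_codebook(descriptors, k=4, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centres (the visual words)."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), size=k, replace=False)]
    for _ in range(iterations):
        labels = nearest(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # leave an empty cluster's centre where it was
                centres[j] = members.mean(axis=0)
    return centres

def encode(image, centres):
    """Bag of Words vector: normalized histogram of visual-word assignments."""
    labels = nearest(patch_descriptors(image), centres)
    hist = np.bincount(labels, minlength=len(centres)).astype(float)
    return hist / hist.sum()

# Placeholder data: random binary images of three different sizes.
rng = np.random.default_rng(0)
images = [(rng.random(shape) > 0.7).astype(np.uint8) for shape in [(40, 60), (90, 35), (64, 64)]]

all_descriptors = np.vstack([patch_descriptors(im) for im in images])
codebook = build_codebook(all_descriptors, k=4)
vectors = [encode(im, codebook) for im in images]  # each has length 4
```

Each image contributes a different number of patch descriptors, yet every encoded vector has the same length as the codebook, so the vectors can go straight into a standard classifier. In practice the codebook would be fitted on training images only and would usually be much larger than four words.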
