Solved – Image feature extraction using an Autoencoder combined with PCA

autoencoders · deep learning · pca · unsupervised learning

Background: I have a fairly large dataset of biomedical images (around 10,000) of 1920×1920 pixels (after cropping out parts of the black borders). My task is to extract the 200 most important features from the images, to be used in a genome-wide association study.

My initial idea was to use a convolutional autoencoder (CAE) for dimensionality reduction, but I quickly realized there was no way I could reduce the dimensions to 200 with the encoder and have the decoder reconstruct the images with acceptable accuracy from only those dimensions.
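A quick dimension count shows why that bottleneck is so aggressive (plain Python, using only the numbers from the question):

```python
# One greyscale image after cropping
input_dims = 1920 * 1920        # 3,686,400 pixel values
bottleneck = 200                # target latent size

ratio = input_dims / bottleneck
print(f"compression ratio = {ratio:,.0f}:1")  # 18,432:1
```

At roughly 18,000:1 compression, a faithful pixel-level reconstruction from the latent code alone is unrealistic for most image content.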

I then had the idea to first train a CAE and use the encoder to create a new dataset: each 1920×1920×1 image (greyscaled; I'm still debating whether that is an acceptable information loss, but it's just so much more convenient) becomes a 30×30×512 encoding, flattened to 460,800 dimensions. On this new dataset I would then use PCA to reduce the dimensions to 200.
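The encoder-then-PCA pipeline can be sketched as follows. This is a minimal sketch, not the actual model: the trained CAE encoder is stood in for by random data (and a narrower width than the real 460,800 dims, to keep the example small), since the question doesn't specify the architecture. Only the shapes and the PCA step are the point:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the trained CAE encoder output: in the real pipeline each
# image would give a 30*30*512 = 460,800-dim flattened encoding; here a
# random matrix with a narrower width keeps the sketch self-contained.
rng = np.random.default_rng(0)
n_images, encoded_dims = 250, 30 * 30 * 8  # 10,000 x 460,800 in practice
encoded = rng.standard_normal((n_images, encoded_dims)).astype(np.float32)

# Randomized SVD keeps PCA tractable at this width; fit once on the
# encoded dataset, after which every image is summarised by 200 features.
pca = PCA(n_components=200, svd_solver="randomized", random_state=0)
features = pca.fit_transform(encoded)
print(features.shape)                        # (250, 200)
print(pca.explained_variance_ratio_.sum())   # fraction of variance kept
```

At the real dataset size (10,000 × 460,800), fitting may not fit in memory at once; `sklearn.decomposition.IncrementalPCA` accepts the data in batches for exactly that situation.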

My questions are:

  1. Does this method make sense? I'm relatively new to machine learning (my background is in mathematical computer science).

  2. If this is a poor method for solving the problem, what would be a better one?

Best Answer

The first question is whether your task requires interpretable, analytical features. If the answer is yes, a CAE is not the best choice: a neural network extracts features in an unsupervised way, and because of that it is almost always very hard for a human to interpret those features.

The second question is what kind of application you have in mind for the features. If you need discriminative information, you should make your problem precise and decide which kinds of objects you want to differentiate, or even try to obtain additional information (labels) that would transform your problem into a supervised one.

Usually in medical fields, features are extracted on strong mathematical or medical grounds. Such features are easier for humans to interpret and can improve medical diagnosis or treatment, but only by making suggestions. The point is that the consequences of decisions in this area are very serious, so they should be made by human medical specialists.

The main question is whether you expect these features to be global or local. If they are local, maybe you can crop the images or divide them into smaller overlapping windows? Also, do you really need such a high resolution? If the features can be extracted from smaller images, you can try resizing them first, which allows you to decrease the number of hidden-layer neurons and often reduces the impact of small noise that your network would otherwise spend capacity encoding and decoding.
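Splitting each 1920×1920 image into overlapping windows is straightforward with plain NumPy; the window and stride sizes below are illustrative choices, not values from the question:

```python
import numpy as np

def sliding_windows(img, win=480, stride=240):
    """Split a 2-D greyscale image into overlapping win x win tiles."""
    h, w = img.shape
    tiles = []
    for top in range(0, h - win + 1, stride):
        for left in range(0, w - win + 1, stride):
            tiles.append(img[top:top + win, left:left + win])
    return np.stack(tiles)

img = np.zeros((1920, 1920), dtype=np.float32)  # placeholder image
tiles = sliding_windows(img)
print(tiles.shape)  # (49, 480, 480): a 7x7 grid of half-overlapping tiles
```

Each tile can then be fed to a (much smaller) autoencoder, and the per-tile codes pooled or concatenated per image.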

And last but not least: your autoencoder will learn to encode the image as a whole. If the expected features are not directly 'visual', your results could be much worse. For example, if an expected feature is the number of certain objects in a picture, your autoencoder could disperse that information across the whole hidden layer.

That said, PCA or similar methods seem a rational approach, since you have to work in an unsupervised way.

(I can't write this as a comment because of a Cross Validated limitation.)
