Solved – Is dimensionality reduction using autoencoders possible with a small sample size?

autoencoders, clustering, machine learning, neural networks

I have a data set that is not very large but is high-dimensional, say 10,000-dimensional. I want to use an autoencoder to extract relevant features (clusters) from the data. Usually when I have seen autoencoders employed, the datasets have been rather big; MNIST, for example, has 60,000 training and 10,000 test observations.

Let's say I want to reduce the dimensionality from 10,000 to 100 and have only 100 training samples. Wouldn't this lead to each of the 100 hidden nodes simply memorizing one of my training samples? And how much bigger would my training set need to be to avoid this kind of trivial representation: 1,000, 5,000, or 10,000 samples? Is there some standard test to check whether the extracted features are really relevant?

Thanks.

Best Answer

That sounds like an interesting application of autoencoders. In general you want to have at least as many training examples as model parameters. It is possible to train a small network (you could use two fully connected layers, for instance) to accomplish your task; however, don't expect it to do well on unseen data from the "same" data distribution.
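
As a rough illustration, here is a minimal sketch of such a two-fully-connected-layer autoencoder, assuming PyTorch. The layer sizes (10,000 → 100 → 10,000) match the question; the optimizer, learning rate, epoch count, and weight decay are illustrative choices, not recommendations:

```python
import torch
import torch.nn as nn

# A minimal two-layer autoencoder: one fully connected encoder
# (10,000 -> 100) and one fully connected decoder (100 -> 10,000).
class SmallAutoencoder(nn.Module):
    def __init__(self, input_dim=10000, code_dim=100):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, code_dim), nn.ReLU())
        self.decoder = nn.Linear(code_dim, input_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = SmallAutoencoder()
# Weight decay adds a little regularization, though it cannot fully
# compensate for having only 100 samples.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.MSELoss()

# Stand-in for your real data: 100 samples, 10,000 dimensions each.
x = torch.randn(100, 10000)

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x), x)  # reconstruction loss
    loss.backward()
    optimizer.step()

# Extract the 100-dimensional codes to use as features for clustering.
with torch.no_grad():
    codes = model.encoder(x)  # shape: (100, 100)
```

Note that even this tiny network has roughly 2 million parameters (two 10,000 × 100 weight matrices plus biases), vastly more than 100 training samples, which is exactly why memorization of the training set is the expected outcome here.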