I have 48 samples (cases and controls) with 27,000 features per sample, so my matrix is [48 × 27000], and I am using a deep belief network (DBN) as my algorithm to predict accuracy on this dataset. But when I load the dataset into the DBN, my results are random: whenever I run the DBN on the same samples with the same parameter values, it gives me a different accuracy. Is there a reason behind this? Can anyone tell me why? Also, can I concatenate the same dataset multiple times and rerun? Is there a way to do that? By concatenating I mean stacking the same dataset multiple times, e.g. 48 + 48 = 96 samples. If concatenating is possible, can anyone give me a reference paper?
Solved – how to handle small datasets with large dimensions
dataset, deep learning, deep-belief-networks, machine learning, neural networks
Related Solutions
As a disclaimer, I work on neural nets in my research, but I generally use relatively small, shallow neural nets rather than the really deep networks at the cutting edge of research that you cite in your question. I am not an expert on the quirks and peculiarities of very deep networks, and I will defer to someone who is.
First, in principle, there is no reason you need deep neural nets at all. A sufficiently wide neural network with just a single hidden layer can approximate any (reasonable) function given enough training data. There are, however, a few difficulties with using an extremely wide, shallow network. The main issue is that these very wide, shallow networks are very good at memorization, but not so good at generalization. So, if you train the network with every possible input value, a super wide network could eventually memorize the corresponding output value that you want. But that's not useful because for any practical application you won't have every possible input value to train with.
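To make the memorization point concrete, here is a minimal sketch (my own construction, not taken from the answer): a single hidden layer of ReLU units, one unit per training point, whose weights are chosen by hand so that the network reproduces every 1-D training label exactly — even when the labels are pure noise and there is nothing real to learn.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def memorizing_net(x_train, y_train):
    """Build a one-hidden-layer ReLU net that fits every training point exactly.

    Each hidden unit "switches on" at one training x, and the output weights
    are the slope changes of the piecewise-linear interpolant through the data.
    """
    order = np.argsort(x_train)
    x, y = x_train[order], y_train[order]
    slopes = np.diff(y) / np.diff(x)                    # slope of each segment
    c = np.concatenate([[slopes[0]], np.diff(slopes)])  # slope *changes*
    knots = x[:-1]                                      # one hidden unit per knot

    def f(x_new):
        # hidden layer: relu(x - knot_i); output layer: weighted sum + bias
        h = relu(np.subtract.outer(np.atleast_1d(x_new), knots))
        return y[0] + h @ c

    return f

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=20)
y_train = rng.normal(size=20)              # random noise: nothing to "learn"
f = memorizing_net(x_train, y_train)
print(np.allclose(f(x_train), y_train))    # prints True: every label reproduced
```

The network is perfect on its training set and meaningless everywhere else — exactly the memorization-without-generalization failure mode described above.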
The advantage of multiple layers is that they can learn features at various levels of abstraction. For example, if you train a deep convolutional neural network to classify images, you will find that the first layer will train itself to recognize very basic things like edges, the next layer will train itself to recognize collections of edges such as shapes, the next layer will train itself to recognize collections of shapes like eyes or noses, and the next layer will learn even higher-order features like faces. Multiple layers are much better at generalizing because they learn all the intermediate features between the raw data and the high-level classification.
So that explains why you might use a deep network rather than a very wide but shallow network. But why not a very deep, very wide network? I think the answer there is that you want your network to be as small as possible to produce good results. As you increase the size of the network, you're really just introducing more parameters that your network needs to learn, and hence increasing the chances of overfitting. If you build a very wide, very deep network, you run the chance of each layer just memorizing what you want the output to be, and you end up with a neural network that fails to generalize to new data.
Aside from the specter of overfitting, the wider your network, the longer it will take to train. Deep networks already can be very computationally expensive to train, so there's a strong incentive to make them wide enough that they work well, but no wider.
Best Answer
What you have is not a problem suitable for a DBN. There is no way to avoid overfitting such data. You need to use linear, strongly regularized models; linear SVMs are often used. There is a whole chapter in The Elements of Statistical Learning about dealing with similar problems (Chapter 18, "High-Dimensional Problems: p ≫ N") — you might check it out (the book is available for free).
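A minimal sketch of the kind of strongly regularized linear model recommended here, assuming scikit-learn is available (the parameter values and the random stand-in data are illustrative, not prescriptive): a linear SVM with a small `C` (i.e. strong regularization), evaluated with cross-validation rather than a single split — which matters when you only have 48 samples.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(48, 27000))   # stand-in for the real 48 x 27000 matrix
y = np.repeat([0, 1], 24)          # 24 cases, 24 controls

# Small C = strong regularization, keeping the linear model simple;
# scaling first so the penalty treats all 27000 features comparably.
model = make_pipeline(StandardScaler(),
                      LinearSVC(C=0.01, max_iter=10000))

# Stratified CV keeps the case/control balance in every fold.
cv = StratifiedKFold(n_splits=6, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
print(scores.mean())  # on pure-noise data this should hover near chance (0.5)
```

Fixing the random seeds, as above, also addresses the run-to-run variability in the question: the randomness comes from random initialization and data shuffling, not from the model being "wrong".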
Edit: the whole point of MLPs and DBNs is that they can learn complex, nonlinear features from the data. You don't have enough data, so any method that allows complex models will just overfit.
Another issue is that in such a high-dimensional problem there is essentially always some hyperplane that can separate your classes (with far more features than samples, points in general position are linearly separable under any labeling), so there is no need for nonlinear methods.
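The separating-hyperplane point is easy to verify empirically. In this sketch (pure NumPy, with random data standing in for the real matrix), 48 points in 27,000 dimensions are fit perfectly by a linear decision rule even when the labels are assigned completely at random, via the minimum-norm least-squares solution:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(48, 27000))        # 48 samples, 27000 features
y = rng.choice([-1.0, 1.0], size=48)    # labels assigned completely at random

# With far more features than samples, X @ w = y is underdetermined,
# so least-squares finds a w that reproduces the random labels exactly.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
train_accuracy = np.mean(np.sign(X @ w) == y)
print(train_accuracy)  # 1.0 -- a perfect separating hyperplane, despite random labels
```

Perfect training accuracy on random labels is precisely why training performance tells you nothing here, and why anything measured on the training set must be replaced by cross-validated estimates.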