MATLAB: Dealing with imbalanced data in a neural network

imbalanced data, neural network

I want to use a deep learning network for a classification problem. My data is imbalanced, meaning one of the classes has fewer training examples than the others.
I know I could remove training data from the other classes, but I wonder whether there is another solution. For example, is there an option to modify the cost layer so that misclassifying a specific class costs more? Thanks,

Best Answer

There are many ways to deal with unbalanced classes when no more real data is available. Over the decades I have used the following:
1. Use the summary statistics of the small classes to simulate more data.
2. Design multiple nets using the smaller classes and subsets of the larger classes, then combine the answers.
3. Use a cost matrix to enhance the influence of the smaller classes and/or reduce the influence of the larger classes.
4. A combination of the above.
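
As a rough illustration of techniques 1 and 3, here is a minimal sketch using only base MATLAB functions. The label vector, the feature matrix Xminor, and the Gaussian assumption for the minority class are all made up for the example; they are not from the original question. Inverse-frequency weighting is one common choice of weights, not the only one.

```matlab
% Hypothetical labels: class 3 is the minority class.
labels = [ones(500,1); 2*ones(450,1); 3*ones(50,1)];

% Technique 3: per-class weights that grow as the class gets rarer.
classIds     = unique(labels);
counts       = arrayfun(@(c) sum(labels == c), classIds);
numClasses   = numel(classIds);
classWeights = numel(labels) ./ (numClasses * counts);  % w_k = N / (K * n_k)

% Technique 1: simulate extra minority samples from the class mean and
% covariance. This assumes the minority class is roughly Gaussian.
rng(0);                                      % reproducible example
Xminor = randn(50, 2) * [1 0.5; 0 1] + 3;    % stand-in for real minority features
mu     = mean(Xminor, 1);                    % 1-by-2 class mean
Sigma  = cov(Xminor);                        % 2-by-2 class covariance
L      = chol(Sigma, 'lower');               % Sigma = L * L'
numNew = 400;
Xsim   = randn(numNew, 2) * L' + repmat(mu, numNew, 1);  % rows ~ N(mu, Sigma)
```

With these weights every class contributes the same total weight (N/K) to the loss. If I remember the toolbox correctly, recent Deep Learning Toolbox releases accept such a vector through the 'ClassWeights' option of classificationLayer, and the shallow-network train function accepts per-sample error weights; check the documentation for your release before relying on either.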
The basis of these techniques can be understood by examining the following term in the Bayesian risk:
Cij * Pi * p(x|i)
which involves the classification cost Cij, the a priori probability Pi, and the class-conditional probability density p(x|i).
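
To see how the cost term changes a decision, here is a small worked sketch. It reads the density term as the class-conditional density p(x|i) and assumes the convention that C(i,j) is the cost of assigning a class-i sample to class j. The priors, Gaussian parameters, and costs are invented for illustration.

```matlab
% Hypothetical two-class problem with 1-D Gaussian class-conditional densities.
prior = [0.95; 0.05];     % Pi: class 2 is the rare class
mu    = [0; 2];           % class means (assumed)
sigma = [1; 1];           % class standard deviations (assumed)

% C(i,j): cost of assigning a class-i sample to class j (assumed convention).
C = [ 0  1;               % a class-1 sample sent to class 2 costs 1
     10  0];              % a rare class-2 sample sent to class 1 costs 10

x = 1.5;                  % point to classify

% Class-conditional densities p(x|i), written out to avoid toolbox functions.
pxi = exp(-0.5 * ((x - mu) ./ sigma).^2) ./ (sigma * sqrt(2*pi));

% Risk of deciding class j: R(j) = sum_i C(i,j) * Pi * p(x|i)
risk = C' * (prior .* pxi);

[~, decision] = min(risk);           % minimum-risk class
[~, mapClass] = max(prior .* pxi);   % maximum-posterior class, ignoring costs
```

Here the plain maximum-posterior rule picks class 1 because of its large prior, while the cost-weighted rule picks the rare class 2. Raising Cij for the small class, raising its effective Pi by simulating more data, or subsampling the large classes all push the same product in the same direction.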
Hope this helps.
Thank you for formally accepting my answer
Greg