Solved – Best ratio of negative examples to create for a multiclass problem

computer vision, image processing, machine learning, multi-class

I am working on a multi-class classification problem using images. I have a training library of images containing 9 different classes of object. However, I also need to train my image classifier to detect negative examples (i.e. images that contain none of the 9 object classes in the training library).

Assuming image frequencies in my training library are perfectly balanced, and each class contains n images, what is the best number of negative examples to include in the dataset? Intuition tells me to create n examples for the 'negative' class. Could anyone comment on the appropriateness of this?

Also if anyone could point towards any academic work published on this topic, I would be very grateful.

Thank you.

Best Answer

Using an equal proportion of instances (n negative examples) is advisable when n is small, or, I would say, when n is not a good representation of the population. However, this practice generally does not give good results when the model is tested on a different set whose class proportions differ from the training data.

It would be better to look at how often negative images occur relative to the other images in reality, because that ratio gives you a hint of the optimal ratio to use. If you are not sure what the optimal ratio should be, you can choose any ratio for the negative class and then set the "class_weight" parameter accordingly when training the model, so that each class is weighted to compensate for how many examples it has.
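To make the "class_weight" suggestion concrete, here is a minimal sketch of one common way to derive the weights from the class counts: weight = n_samples / (n_classes * count), which is the same heuristic scikit-learn uses for its "balanced" option. The helper name `balanced_class_weights`, the label encoding (0 to 8 for the object classes, 9 for the negative class), and the counts (n = 100, 300 negatives) are all assumptions made up for illustration, not values from the question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} where weight = n_samples / (n_classes * count).

    Classes with more examples get a smaller weight, so each class
    contributes roughly equally to the training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical dataset: 9 object classes (labels 0-8) with n = 100 images each,
# plus a negative class (label 9) with 300 images.
n = 100
labels = [c for c in range(9) for _ in range(n)] + [9] * 300

weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

A dictionary like `weights` is the kind of value that can typically be passed as `class_weight` to libraries that accept a per-class mapping (for example many scikit-learn classifiers, or Keras `fit`). Whether balancing is actually what you want depends on how often negatives occur in the data the classifier will see in practice.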