Solved – Faster R-CNN: How to avoid multiple detections in the same area

conv-neural-network, neural-networks, object-detection, tensorflow

I use the TensorFlow Object Detection API to train on the Pascal VOC dataset from scratch. I just had a look at the first results after 200k training steps, and they are okay, except that I often get many detections of the same class in overlapping regions. For example, consider the following detections (ignore the wrong person detection in the first image):

Multiple detection of the same motorcycle
Multiple detections of the same aeroplane

Is there a general way to avoid such multiple detections of the same object? I guess this is caused by overlapping region proposals for which the detection network predicts objects that match the ground-truth data above the 0.7 IoU threshold, so maybe it would help to set this threshold a bit higher?

Btw, I am using a Faster R-CNN ResNet-101 architecture.

Edit:

I get a mAP of 0.3 on the whole model.

Best Answer

This is a common property of object detectors such as Faster R-CNN: they predict every object several times. It is the job of non-maximum suppression (NMS) to filter out the duplicates. Loosely explained, NMS takes pairs of overlapping boxes of the same class and, if their overlap is greater than some threshold, keeps only the one with the higher probability. This procedure continues until no boxes with sufficient overlap remain. This minimum overlap ratio (the NMS IoU threshold) is one of the hyperparameters you can tune.
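To make the procedure concrete, here is a minimal NumPy sketch of greedy NMS for a single class (the Object Detection API already does this internally per class, so this is only for illustration; the function name and the 0.6 threshold are my own choices):

import numpy as np

def nms(boxes, scores, iou_threshold=0.6):
    """Greedy non-maximum suppression.

    boxes:  (N, 4) array of [x1, y1, x2, y2]
    scores: (N,) array of class probabilities
    Returns the indices of the boxes to keep.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # highest score first

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # Intersection of the current best box with all remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Discard boxes that overlap the kept box more than the threshold
        order = order[1:][iou <= iou_threshold]
    return keep

Lowering iou_threshold makes the suppression more aggressive, which directly reduces duplicate boxes on the same object.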

The second hyperparameter you can tune is the threshold on the class probability (e.g. 70%): all objects predicted with a lower probability are simply ignored.

Tuning these two hyperparameters should give you a satisfactory prediction quality.
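If you are using the standard pipeline.config of the TensorFlow Object Detection API, both hyperparameters live in the second-stage post-processing block of the faster_rcnn proto. The exact field names can vary slightly between API versions, and the values below are only example choices, so check them against your own config:

faster_rcnn {
  # ...
  second_stage_post_processing {
    batch_non_max_suppression {
      score_threshold: 0.3        # drop detections below this class probability
      iou_threshold: 0.5          # suppress boxes overlapping more than this with a higher-scoring box
      max_detections_per_class: 100
      max_total_detections: 300
    }
    score_converter: SOFTMAX
  }
}

Raising score_threshold and/or lowering iou_threshold should visibly reduce the duplicate motorcycle and aeroplane boxes in your examples.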
