# Non Max Suppression and Object Detection

Artificial Intelligence Asked by Moe Kaung Kin on August 16, 2020

My understanding of non-max suppression (NMS) is that it suppresses all overlapping boxes whose Jaccard overlap is above a threshold (say 0.5). The boxes considered are those whose confidence score is above a minimum (say 0.2). So, as I understand it, a box with a score above 0.2 (say 0.3) and an overlap of 0.4 won't be suppressed. That way, one object would be predicted by many boxes: one high-score box and many low-confidence boxes. But I found that the model predicts only one box per object. Can someone enlighten me?
I am currently looking at the SSD implementation from https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Object-Detection

Here is the code:

    # Sort the scores and find the Jaccard overlap between all pairs of boxes
    class_scores, sort_ind = class_scores.sort(dim=0, descending=True)
    class_decoded_locs = class_decoded_locs[sort_ind]  # (n_min_score, 4)
    overlap = find_jaccard_overlap(class_decoded_locs, class_decoded_locs)
    suppress = torch.zeros((n_above_min_score), dtype=torch.uint8).to(device)

    for box in range(class_decoded_locs.size(0)):
        # If this box is already marked for suppression, skip it
        if suppress[box] == 1:
            continue
        # Mark every box that overlaps this one by more than max_overlap
        suppress = torch.max(suppress, overlap[box] > max_overlap)
        # The box overlaps itself fully, so un-mark it again
        suppress[box] = 0
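
To try the loop in isolation, here is a minimal pure-Python sketch of the same greedy idea for a single class. It is not the repository's code: the names `jaccard_overlap` and `nms_single_class`, the `(x_min, y_min, x_max, y_max)` box format, and the boxes and scores are all assumptions made up for illustration.

```python
def jaccard_overlap(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_single_class(boxes, scores, min_score=0.2, max_overlap=0.5):
    """Greedy NMS for one class; returns indices (into `boxes`) of kept boxes."""
    # Keep only candidates above the score threshold, highest score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s > min_score),
        key=lambda i: scores[i],
        reverse=True,
    )
    suppress = [False] * len(order)
    for pos, i in enumerate(order):
        if suppress[pos]:
            continue  # already suppressed by a higher-scoring box
        # Suppress every lower-ranked box that overlaps box i too much.
        for later in range(pos + 1, len(order)):
            if jaccard_overlap(boxes[i], boxes[order[later]]) > max_overlap:
                suppress[later] = True
    return [i for pos, i in enumerate(order) if not suppress[pos]]


# Hypothetical candidates for one class (made-up numbers).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 6, 6), (50, 50, 60, 60)]
scores = [0.9, 0.6, 0.3, 0.1]
kept = nms_single_class(boxes, scores)
print(kept)
```

With these made-up numbers, box 1 is suppressed by box 0 (overlap 81/119, about 0.68), box 3 is dropped by the score threshold, and box 2 survives because its overlap with box 0 is only 0.36. In other words, under this sketch a lower-scoring box whose overlap stays below `max_overlap` is indeed kept.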

I might be able to help with the theory, but not so much with the code: it does not look like a standard NMS API from TensorFlow or PyTorch (as far as I can tell, it is custom code).

The key point is that a bounding box is removed only if it holds a prediction for the same class as the box it overlaps with, and with lower confidence (that is why it gets removed).

Here is an example, where we have:

• Two classes, $$c \in \{c_1, c_2\}$$, where $$c_1$$ is "star" and $$c_2$$ is "moon"
• Three bounding boxes

The blue bounding boxes hold predictions for class $$c_1$$, with $$p(c_1)_{box1} = 0.8$$ and $$p(c_1)_{box2} = 0.9$$. The green box, on the other hand, holds a prediction for class $$c_2$$.

The three boxes overlap heavily, so the overlap between any box $$x$$ and any other box $$y$$ is above the IoU threshold: $$IoU(box_x, box_y) > 0.5$$. So, in principle, all boxes are susceptible to being removed.
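
For reference, the IoU (Jaccard overlap) used here is the standard ratio of intersection area to union area. The two boxes $$A$$ and $$B$$ below are hypothetical and chosen only to make the arithmetic easy:

$$IoU(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$$

Take $$A$$ with corners $$(0, 0)$$ and $$(2, 2)$$, and $$B$$ with corners $$(1, 1)$$ and $$(3, 3)$$. Each box has area $$4$$ and they share a $$1 \times 1$$ square, so $$IoU(A, B) = \frac{1}{4 + 4 - 1} = \frac{1}{7} \approx 0.14$$. That is below a threshold of $$0.5$$, so neither box would suppress the other.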

However, NMS applies only among boxes predicting the same class (in this case, the blue ones). So the NMS algorithm is: if the boxes overlap, i.e. $$IoU(box_1, box_2) > 0.5$$, which is true here, remove all boxes whose class probability is not maximal. Said differently, keep just the box with the highest $$p(c_1)$$ and remove the rest. So $$box_1$$, with class probability $$p(c_1) = 0.8$$, would be removed.

So what happens with the green box? Isn't it overlapping as well? Yes, but the green box is not trying to predict the same object; it is trying to predict another object, of class $$c_2$$, which happens to be very close to the first object, of class $$c_1$$. This is how object detectors support detection of different overlapping objects.
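
The star/moon example can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `iou` and `nms_per_class`, the box coordinates, and the score of the green "moon" box (0.7) are invented here, since the example above does not give them.

```python
def iou(a, b):
    """IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_per_class(detections, max_overlap=0.5):
    """detections: list of (label, score, box). Returns the kept detections."""
    kept = []
    # Visit detections from highest to lowest score.
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        label, _, box = det
        # Only an already-kept box of the SAME class can suppress this one.
        clash = any(k[0] == label and iou(k[2], box) > max_overlap for k in kept)
        if not clash:
            kept.append(det)
    return kept


# Three heavily overlapping boxes (coordinates are made up).
detections = [
    ("star", 0.8, (10, 10, 50, 50)),  # box 1 (blue)
    ("star", 0.9, (12, 12, 52, 52)),  # box 2 (blue)
    ("moon", 0.7, (11, 11, 51, 51)),  # green box; score is an assumption
]
kept = nms_per_class(detections)
for label, score, box in kept:
    print(label, score, box)
```

Here the "star" box with score 0.8 is dropped because it overlaps the 0.9 "star" box, while the "moon" box is kept even though it overlaps both, since it belongs to a different class.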

Answered by JVGD on August 16, 2020
