| Method | Advantage | Disadvantage |
| --- | --- | --- |
| R-CNN | The network transforms the object detection problem into a classification problem and greatly improves accuracy. | It generates many partially overlapping candidate regions for each detection target, which leads to redundant computation. |
| SPP-Net | It introduces a spatial pyramid pooling layer after the last convolution layer, which eliminates repeated processing. | Training is a multi-stage process with a long training time. |
| Fast R-CNN | Its training and testing are significantly faster than those of SPP-Net. The input image can be any size. | The network still depends on a candidate region selection algorithm. |
| Faster R-CNN | The network is faster than Fast R-CNN and no longer depends on an external region selection algorithm. | The training process is complex, and there is still much room for optimization in the computation. |
| SSD | It adopts multi-scale feature maps, and its processing speed is fast. | The network is not robust when detecting small objects. |
| YOLO | The network meets real-time requirements while using the full image as context information. | It is relatively sensitive to object scale and performs poorly on small targets. |
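
The SPP-Net entry above refers to a spatial pyramid pooling layer placed after the last convolution layer. As a rough illustration of that idea, and not the SPP-Net implementation itself, the NumPy sketch below max-pools a feature map over a few grid resolutions and concatenates the results, so the output length does not depend on the input height and width. The function name `spatial_pyramid_pool`, the helper `ceil_div`, and the pyramid levels `(1, 2, 4)` are assumptions chosen for this example.

```python
import numpy as np


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    For each pyramid level n, the map is split into an n x n grid and
    each cell is max-pooled per channel. The output therefore has
    C * sum(n * n for n in levels) entries, whatever H and W are.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every
            # cell non-empty, even when H or W is smaller than n.
            r0 = (i * height) // n
            r1 = ceil_div((i + 1) * height, n)
            for j in range(n):
                c0 = (j * width) // n
                c1 = ceil_div((j + 1) * width, n)
                cell = feature_map[:, r0:r1, c0:c1]
                pooled.append(cell.max(axis=(1, 2)))
    return np.concatenate(pooled)


# Two feature maps with different spatial sizes but the same channel count.
rng = np.random.default_rng(0)
small = rng.standard_normal((8, 6, 9))
large = rng.standard_normal((8, 13, 17))

print(spatial_pyramid_pool(small).shape)  # (168,) = 8 * (1 + 4 + 16)
print(spatial_pyramid_pool(large).shape)  # (168,)
```

Because both inputs yield a vector of the same length, a fully connected layer can follow directly, and the convolutional features need to be computed only once per image instead of once per candidate region.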