Object detection is a computer vision task that involves predicting the presence of one or more objects in an image, along with their classes and bounding boxes. YOLO (You Only Look Once) is a state-of-the-art object detector that can perform object detection in real time with good accuracy.

The first three YOLO versions were released in 2016, 2017 and 2018 respectively. In 2020, however, three major versions of YOLO were released within a period of only a few months: YOLO v4, YOLO v5 and PP-YOLO. The release of YOLO v5 even caused a controversy in the machine learning community.
This has also created a dilemma for people who are about to start a machine learning project. In this article, we discuss the reasons behind so many new YOLO releases, with an emphasis on their originality, authorship, performance and major improvements, to help you choose the most appropriate version for your project.
How YOLO evolved
YOLO was first introduced in 2016 and was a milestone in object detection research because it could detect objects in real time with better accuracy than earlier real-time detectors.
It was proposed by Joseph Redmon, a graduate of the University of Washington. The paper describing YOLO won the OpenCV People’s Choice Award at the Conference on Computer Vision and Pattern Recognition (CVPR) in 2016.
YOLO versions by Joseph Redmon
- Version 1: ‘You Only Look Once: Unified, Real-Time Object Detection’ (2016)
- Version 2: ‘YOLO9000: Better, Faster, Stronger’ (2017)
- Version 3: ‘YOLOv3: An Incremental Improvement’ (2018)
YOLO v2 can process images at 40–90 FPS, while YOLO v3 lets us easily trade off speed against accuracy just by changing the model size, without any retraining.

Major YOLO implementations
The main implementation of Redmon’s YOLO is based on Darknet, an open source neural network framework written in C and CUDA. Darknet defines the underlying architecture of the network and is used as the framework for training YOLO. This implementation was introduced by Redmon himself; it is fast, easy to install, and supports both CPU and GPU computation.
Later, a PyTorch translation of YOLO v3 was introduced by Glenn Jocher of Ultralytics LLC.
No YOLO updates after v3?
YOLO quickly became famous in the computer vision community for its high speed combined with good accuracy. However, in February 2020, Joseph Redmon, the creator of YOLO, announced that he had stopped his research in computer vision. He stated that this was due to several concerns about the potential negative impact of his work.
This led to heated discussions in the community and raised an important question: will there be any YOLO updates in the future?
YOLO v4
Redmon’s withdrawal was not the end of YOLO. To the relief of many in the computer vision community, the fourth generation of YOLO was released in April 2020. It was introduced in a paper titled ‘YOLOv4: Optimal Speed and Accuracy of Object Detection’ by Alexey Bochkovskiy et al.
Alexey also continued Redmon’s work in a fork of the main repository. At its release, YOLO v4 was considered the fastest and most accurate real-time model for object detection.
Major improvements in YOLO v4
YOLO v4 draws on a number of state-of-the-art BoF (bag of freebies) and BoS (bag of specials) techniques. BoF methods improve the accuracy of the detector without increasing the inference time; they only increase the training cost. BoS methods, on the other hand, increase the inference cost by a small amount but significantly improve the accuracy of object detection.
YOLO v4 Performance
YOLO v4 is also based on Darknet. It obtained an AP of 43.5 percent on the COCO dataset at a real-time speed of 65 FPS on a Tesla V100, outperforming the other fast and accurate detectors of the time in both speed and accuracy.
Compared with YOLO v3, AP and FPS increased by 10 percent and 12 percent, respectively.

Redmon’s response on YOLO authorship
On 24th April 2020, the README file of Redmon’s original GitHub repository was updated with links to Alexey’s forked repository and to the YOLO v4 paper. Redmon also tweeted about it.
YOLO v5
Within just two months of the release of YOLO v4, yet another version of YOLO was released: YOLO v5. It is by Glenn Jocher, who was already known in the community for creating the popular PyTorch implementation of YOLO v3.
On June 9, 2020, Jocher announced that his YOLO v5 implementation was publicly released and recommended for use in new projects. However, he did not publish a paper to accompany the initial release.
Major improvements in YOLO v5
YOLO v5 differs from all prior releases in that it is a PyTorch implementation rather than a fork of the original Darknet. Like YOLO v4, YOLO v5 has a CSP backbone and a PANet neck. The major improvements include mosaic data augmentation and auto-learning bounding box anchors.
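To give a feel for what learning anchors from data means, here is a minimal sketch of k-means clustering over box widths and heights with an IoU-based similarity, the generic approach introduced in the YOLO9000 paper. It is not the Ultralytics implementation; the function names and the sample boxes are made up for illustration.

```python
import random


def wh_iou(box, anchor):
    """IoU of two (width, height) pairs, assuming both share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (width, height) pairs into k anchors, assigning each box by highest IoU."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct training boxes
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                # New anchor is the mean width and height of its cluster.
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor unchanged
        if new_anchors == anchors:
            break  # assignments have stabilised
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])  # smallest area first


# Hypothetical ground-truth box sizes in pixels (illustrative only).
sample_boxes = [(10, 12), (11, 13), (12, 11), (50, 60), (52, 58),
                (48, 62), (120, 40), (118, 42), (122, 38)]
print(kmeans_anchors(sample_boxes, k=3))
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, since IoU is independent of absolute box size.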
Controversy in machine learning community
The release of YOLO v5 attracted much attention and caused heated discussions on machine learning community platforms. This was mainly due to several claims in an article about YOLO v5 published by the Roboflow team.
That article, titled ‘YOLOv5 is Here’, was published on the Roboflow blog on June 10, 2020, and made several notable claims. The following are some quotes from that blog post by Joseph Nelson and Jacob Solawetz.
"Running a Tesla P100, we saw inference times up to 0.007 seconds per image, meaning 140 frames per second (FPS)! By contrast, YOLO v4 achieved 50 FPS after having been converted to the same Ultralytics PyTorch library."
"YOLO v5 is small. Specifically, a weights file for YOLO v5 is 27 megabytes. Our weights file for YOLO v4 (with Darknet architecture) is 244 megabytes. YOLO v5 is nearly 90 percent smaller than YOLO v4."
In short, YOLO v5 was said to be much faster and more lightweight than YOLO v4, while its accuracy was on par with the YOLO v4 benchmark. But the major question raised by the community was: are these benchmarks accurate and reproducible?
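The two headline figures in the quotes above are simple ratios, and it is easy to check them. The sketch below only redoes the arithmetic on the quoted numbers; the function names are illustrative and do not come from the Roboflow post.

```python
def fps_from_latency(seconds_per_image: float) -> float:
    """Frames per second implied by a per-image inference time."""
    return 1.0 / seconds_per_image


def percent_smaller(new_size: float, old_size: float) -> float:
    """How much smaller new_size is than old_size, as a percentage of old_size."""
    return 100.0 * (old_size - new_size) / old_size


# Numbers taken from the quotes: 0.007 s per image, 27 MB vs 244 MB weights.
print(f"{fps_from_latency(0.007):.1f} FPS")        # 142.9 FPS
print(f"{percent_smaller(27, 244):.1f}% smaller")  # 88.9% smaller
```

So 0.007 seconds per image corresponds to roughly 143 FPS, which the post rounds to 140, and the weights file is about 89 percent smaller, in line with the "nearly 90 percent" in the quote.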
Responses
Alexey, the author of YOLO v4, was not happy about how these comparisons had been made. He responded to several questions raised on GitHub, pointing out the issues with the comparisons, especially the batch size.
The Roboflow and YOLO v5 developers also responded constructively to questions from the Hacker News community and, on June 14, published an article on the Roboflow blog describing how they had compared the two versions.
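To see why batch size matters in such comparisons, consider the difference between throughput (images per second when many images are processed together) and latency (how long a single image waits for its result). The sketch below uses made-up timings purely for illustration; they are not measurements of YOLO v4 or YOLO v5.

```python
def throughput_fps(batch_size: int, seconds_per_batch: float) -> float:
    """Images processed per second when a whole batch is run at once."""
    return batch_size / seconds_per_batch


def latency_seconds(seconds_per_batch: float) -> float:
    """Every image in a batch becomes available only when the full batch finishes."""
    return seconds_per_batch


# Hypothetical timings, chosen only to illustrate the effect.
batched = throughput_fps(batch_size=32, seconds_per_batch=0.224)
single = throughput_fps(batch_size=1, seconds_per_batch=0.020)

print(f"batch of 32: {batched:.1f} FPS, latency {latency_seconds(0.224) * 1000:.0f} ms")
print(f"batch of 1:  {single:.1f} FPS, latency {latency_seconds(0.020) * 1000:.0f} ms")
```

In this made-up case the batched run reports a much higher FPS even though each individual image waits far longer for its result, which is why a fair comparison needs both models measured at the same batch size.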
PP-YOLO
PP-YOLO was introduced in July 2020 in a paper titled ‘PP-YOLO: An Effective and Efficient Implementation of Object Detector’ by Xiang Long et al. It is based on PaddlePaddle (Parallel Distributed Deep Learning), an open source deep learning platform originally developed by Baidu scientists.
Is PP-YOLO a novel model?
PP-YOLO is based on the YOLO v3 model. The paper clearly states that the goal of PP-YOLO is to implement an object detector with relatively balanced effectiveness and efficiency that can be directly applied in real application scenarios, rather than to propose a novel detection model.
The notable changes include replacing the Darknet53 backbone of YOLO v3 with a ResNet backbone and increasing the training batch size from 64 to 192 (a mini-batch size of 24 on each of 8 GPUs).
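The batch size figure follows directly from the per-GPU mini-batch and the number of GPUs. A minimal sketch, assuming plain data-parallel training where each GPU processes its own mini-batch per step (the function name is illustrative):

```python
def global_batch_size(per_gpu_batch: int, num_gpus: int) -> int:
    """Total number of images per training step across all GPUs."""
    return per_gpu_batch * num_gpus


pp_yolo_batch = global_batch_size(per_gpu_batch=24, num_gpus=8)
print(pp_yolo_batch)       # 192
print(pp_yolo_batch / 64)  # 3.0, i.e. three times the batch size of 64
```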
PP-YOLO Performance
According to the paper, PP-YOLO achieves a mAP of 45.2% on the COCO dataset, which exceeds the 43.5% of YOLO v4. When tested on a V100 with a batch size of 1, PP-YOLO achieves an inference speed of 72.9 FPS, which is also higher than the 65 FPS of YOLO v4.
The PP-YOLO authors speculate that the main reason for this performance improvement is that TensorRT is better optimized for ResNet models than for Darknet.
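When reading these numbers, it helps to separate a gain in percentage points from a relative gain. The sketch below applies both to the figures quoted above; the function names are illustrative.

```python
def point_gain(new: float, old: float) -> float:
    """Absolute difference, e.g. in mAP percentage points."""
    return new - old


def relative_gain_percent(new: float, old: float) -> float:
    """Improvement of new over old, as a percentage of old."""
    return 100.0 * (new - old) / old


# mAP on COCO: PP-YOLO 45.2 vs YOLO v4 43.5; speed on a V100: 72.9 vs 65 FPS.
print(f"{point_gain(45.2, 43.5):.1f} mAP points")          # 1.7 mAP points
print(f"{relative_gain_percent(72.9, 65.0):.1f}% faster")  # 12.2% faster
```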

Final words
In this article, we discussed the important milestones in YOLO’s evolution and the story behind the many new YOLO releases in 2020, with an emphasis on the major improvements and performance of the latest versions.
In summary, YOLO v4 is the latest Darknet-based implementation of this state-of-the-art object detector, and it comes with a paper and published benchmarks by Alexey Bochkovskiy et al. YOLO v5, on the other hand, is a new PyTorch implementation by Ultralytics; when tested with a larger batch size, it is said to have a higher inference speed than most other detectors. However, at the time of writing, there is no peer-reviewed paper for YOLO v5. PP-YOLO is another new YOLO upgrade, based on the PaddlePaddle deep learning framework, which improves the YOLO v3 model to obtain a better balance between effectiveness and efficiency.
The points we discussed, such as the architecture, improvements and performance of each release, should help when selecting the most appropriate YOLO version for a particular project. Keep learning!
References
[1] Joseph Redmon’s official website.
[2] Published papers of YOLO v1, YOLO v2, YOLO v3, YOLO v4, PP-YOLO.
[3] GitHub repositories of the original YOLO by Redmon, YOLO v4 by Alexey, YOLO v5 by Jocher and PP-YOLO by Xiang Long.
[4] "YOLOv5 is Here" and "Responding to the Controversy about YOLOv5" blog posts by Joseph Nelson and Jacob Solawetz on Roboflow blog.
[5] "YOLO Is Back! Version 4 Boasts Improved Speed and Accuracy" article by Hecate, published on syncedreview.com.