Image Dogtection: A Comparison of Object Detection Algorithms
Noah Baron
Prince of Wales Secondary
Floor Location : J 204 D

I tested which object detection algorithm would be the most accurate, comparing YOLOv2, YOLOv3, SSD, and Faster R-CNN. I used images that varied in quality, angle, lighting, and number of objects. YOLOv2 and YOLOv3 were run on an Acer Aspire E-15 laptop computer. SSD and Faster R-CNN were run on Google Colab, a Google service that lets you run code on its servers, which provided an Nvidia Tesla K80 GPU. I conducted five rounds of testing but report only four in my results. The first round consisted of five stock images of Bernese Mountain Dogs; the second used stock photos of five different dog breeds; the third had five mediocre-quality images of my pet dog Saphira; and the fourth consisted of poor-quality images of Saphira. In the fifth round I tested five images of wolves that look similar to dogs. This round was removed from the results because neither the COCO dataset (used to train YOLOv2 and YOLOv3) nor the Open Images V4 dataset (used to train SSD and Faster R-CNN) includes wolves, so the algorithms could not recognize them.

I concluded that Faster R-CNN performed best, with the highest percentage of objects and images detected correctly. Although Faster R-CNN was the most accurate, it was also the slowest. While YOLOv2, YOLOv3, and SSD had computation times of approximately 30-40 seconds, Faster R-CNN took approximately 90 seconds. Another visible trend was that Faster R-CNN and SSD detected more objects on average than YOLOv2 and YOLOv3. In the fourth round alone, Faster R-CNN recognized an average of 4 objects per photo, SSD averaged 6.2 objects, YOLOv2 averaged 1 object, and YOLOv3 averaged 1.2 objects.
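
The two measures reported above, the percentage of images detected correctly and the average number of objects detected per image, could be computed along these lines. This is a minimal Python sketch, not the code used in the project: the `ImageResult` record, the `summarize` function, and the example values are all hypothetical placeholders.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ImageResult:
    """Outcome of running one detector on one image (hypothetical record)."""
    objects_detected: int  # total number of boxes the detector reported
    dog_found: bool        # True if the dog in the image was labelled correctly


def summarize(results):
    """Return (percent of images with the dog found, mean objects per image)."""
    if not results:
        raise ValueError("need at least one image result")
    percent_correct = 100.0 * sum(r.dog_found for r in results) / len(results)
    avg_objects = mean(r.objects_detected for r in results)
    return percent_correct, avg_objects


# Placeholder values for one five-image round; NOT the project's measurements.
example_round = [
    ImageResult(objects_detected=2, dog_found=True),
    ImageResult(objects_detected=3, dog_found=True),
    ImageResult(objects_detected=1, dog_found=False),
    ImageResult(objects_detected=2, dog_found=True),
    ImageResult(objects_detected=2, dog_found=True),
]

percent, avg = summarize(example_round)
print(f"{percent:.0f}% of images correct, {avg:.1f} objects per image")
```

Running `summarize` once per algorithm and per round would give one comparable pair of numbers for each cell of a results table.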

In the future, I would improve my project by running the algorithms on the same GPU, training them on the same dataset, and removing the confidence threshold. Doing so would make the comparison fairer and more accurate. In addition, I would test how the confidence threshold affects accuracy and the number of objects detected. The next step for this project would be to produce an app that could be trained on images of household pets, making this research more practical.
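
One way to study the effect of the confidence threshold would be to count how many detections survive at several threshold values. The sketch below only illustrates the idea: the `count_above` and `sweep` functions are hypothetical, and the confidence scores are made up rather than taken from any of the four algorithms.

```python
def count_above(confidences, threshold):
    """Number of detections whose confidence is at least `threshold`."""
    return sum(1 for c in confidences if c >= threshold)


def sweep(confidences, thresholds):
    """Map each threshold to the number of detections that survive it."""
    return {t: count_above(confidences, t) for t in thresholds}


# Made-up confidence scores for a single image; NOT real detector output.
scores = [0.92, 0.61, 0.34, 0.12, 0.05]

counts = sweep(scores, [0.0, 0.25, 0.5, 0.75])
for t, n in counts.items():
    print(f"threshold {t:.2f}: {n} detections kept")
```

A threshold of 0.0 keeps every detection, which corresponds to removing the threshold entirely; raising it trades the number of objects reported against how confident the algorithm is in each one.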