Real Time Object Detection & Recognition: A Comparative Study of YOLOv3 and YOLOv7 in OpenCV
Abstract
Real-time object detection is a fundamental task in computer vision, with applications in autonomous vehicles, surveillance systems, robotics, and other domains. This work presents the design and implementation of a real-time object detection system built with OpenCV (Open Source Computer Vision Library). The system aims to detect and localize objects accurately and efficiently in video streams or captured frames. The work begins with dataset collection and annotation: a diverse set of images is acquired and annotated with bounding boxes around the objects of interest, and the annotated dataset is then used for model training and evaluation. Several deep learning algorithms are considered for object detection, including Single Shot MultiBox Detector (SSD), You Only Look Once (YOLO), and Faster R-CNN, and their performance is compared to identify the most suitable approach. Preprocessing techniques such as resizing, normalization, and noise reduction are applied to improve the quality of the input frames. Feature extraction is performed with a VGG16 deep learning model fine-tuned on the annotated dataset. The selected model is integrated into the real-time system through OpenCV. Detection accuracy is assessed using standard metrics: precision, recall, F1 score, and mean average precision (mAP). The evaluation is carried out on benchmark datasets and real-world scenarios, using two YOLO models, YOLOv3 and YOLOv7, to gauge the system's robustness and generalization capabilities.
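As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 score might be computed for a single frame by matching predicted boxes to ground-truth boxes on intersection over union (IoU). It is not the authors' evaluation code. The function names, the `(x_min, y_min, x_max, y_max)` box format, the greedy matching strategy, and the 0.5 IoU threshold are all assumptions made for the example. Predictions are assumed to be sorted by descending confidence, and mAP, which additionally averages precision over recall levels and classes, is not computed here.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(
    preds: List[Box], truths: List[Box], iou_thresh: float = 0.5
) -> Tuple[float, float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction counts as a true positive when its best IoU reaches
    iou_thresh (0.5 is an assumed, commonly used value).
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= iou_thresh:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(truths) - tp  # ground-truth objects that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical boxes: one good detection, one false alarm, one missed object.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
    print(precision_recall_f1(preds, truths))
```

In a comparison such as YOLOv3 versus YOLOv7, the same matching routine would be applied to each model's detections on the same frames so that the resulting scores are directly comparable.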