Improved YOLOv5 with attention mechanisms: research on detecting objects without fixed shapes

Junlin Li


Supervised by Charith Perera; Moderated by Alia I Abdelmoty

Deep learning methods, such as convolutional neural networks, have been shown to outperform prior machine learning techniques in multiple domains, including computer vision. Object detection has always been an active area of computer vision research.

The YOLO algorithm was introduced by Joseph Redmon et al. in the CVPR 2016 paper "You Only Look Once: Unified, Real-Time Object Detection". YOLO builds on convolutional neural networks and surpasses traditional CNN-based detection pipelines in detection frame rate. For objects with regular shapes, the YOLOv5 detection framework performs well. However, many objects in everyday life do not have a common shape or colour, so the framework's detection results lack ideal accuracy and practicality for them.

In this project, I integrate attention modules into the overall YOLOv5 framework. First, image enhancement techniques are used to improve the robustness of the model on the target tasks. Then, by assigning different attention weights to different image regions, the model can focus on the important regions, improving the accuracy and performance of object detection. For objects without a fixed shape, such as steam, water droplets on the floor, and fire, I integrate three different attention mechanisms (SENet (Squeeze-and-Excitation Networks), CBAM (Convolutional Block Attention Module), and CoordAtt (Coordinate Attention)), then observe and analyse whether the important characteristic factors and key parameters improve from the original framework to the integrated one, and whether the overall architecture is optimised.
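To illustrate the core idea shared by these attention mechanisms, the sketch below implements SENet-style channel attention in plain NumPy: a feature map is "squeezed" to one descriptor per channel, an "excitation" bottleneck produces a gate in (0, 1) per channel, and the map is reweighted channel-wise. This is a minimal illustration, not the project's actual code; the random stand-in weights `w1` and `w2` are hypothetical (in a real network they are learned), and a real YOLOv5 integration would use PyTorch modules.

```python
import numpy as np

def se_attention(x, reduction=4, seed=0):
    """Squeeze-and-Excitation channel attention on a (C, H, W) feature map.

    w1, w2 are random stand-ins for learned weights (illustration only).
    """
    c = x.shape[0]
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c // reduction, c)) * 0.1  # bottleneck down
    w2 = rng.standard_normal((c, c // reduction)) * 0.1  # bottleneck up

    # Squeeze: global average pooling -> one scalar per channel
    z = x.mean(axis=(1, 2))                        # shape (C,)
    # Excitation: small MLP with ReLU, then a sigmoid gate per channel
    s = np.maximum(w1 @ z, 0.0)                    # shape (C // reduction,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))         # shape (C,), values in (0, 1)
    # Reweight: scale each channel of the feature map by its gate
    return x * gate[:, None, None]

feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = se_attention(feat)
print(out.shape)  # (8, 4, 4)
```

CBAM extends this idea with an additional spatial-attention map, and CoordAtt factorises the pooling along the height and width axes to retain positional information, but all three follow the same pattern of computing a gate from pooled features and multiplying it back onto the feature map.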

Final Report (10/09/2023) [Zip Archive]
