Frontiers of Data and Computing ›› 2023, Vol. 5 ›› Issue (6): 161-172.

CSTR: 32002.14.jfdc.CN10-1649/TP.2023.06.015

doi: 10.11871/jfdc.issn.2096-742X.2023.06.015

Application of Improved Lightweight YOLOv5 Algorithm in Pedestrian Detection

WANG Ziyuan, WANG Guozhong*

  1. School of Electrical and Electronic Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  • Received: 2022-07-11 Online: 2023-12-20 Published: 2023-12-25
  • Corresponding author: WANG Guozhong (E-mail: wanggz@sues.edu.cn)
  • About the authors:
    WANG Ziyuan is a master's student in control engineering at the School of Electrical and Electronic Engineering, Shanghai University of Engineering Science. His main research interests are computer vision and deep learning.
    In this paper, he is responsible for writing the first draft and for the experimental validation.
    E-mail: wangziyuansues@qq.com
    WANG Guozhong, Ph.D., is a professor in the School of Electrical and Electronic Engineering, Shanghai University of Engineering Science. His main research interests are video encoding and decoding, image processing, and machine learning.
    In this paper, he is responsible for formulating the framework of the paper and suggesting revisions.
    E-mail: wanggz@sues.edu.cn
  • Funding:
    National Key Research and Development Program of China, "Broadband Communication and New Network" (2019YFB1802702)

Abstract:

[Objective] Existing pedestrian detection algorithms suffer from complex models, low detection accuracy, and slow detection speed. To address these problems, this paper proposes an improved YOLOv5 algorithm that is better suited to pedestrian detection. [Methods] First, the standard convolutions in the YOLOv5 backbone network are replaced with depthwise separable convolutions, which reduces the model's computation and parameter count and improves its detection efficiency. Then, channel attention and spatial attention mechanisms are added to the feature fusion part of the backbone network so that the network focuses on the location and channel information of pedestrians in the image. Finally, the EIOU loss function is used to optimize model training, and the K-means++ clustering algorithm is used to generate the prior (anchor) boxes. [Results] The improved model was compared with other algorithms on the INRIA pedestrian detection dataset. The results show that it reaches a detection accuracy of 89%, which is 7.6% higher than the original model, at a detection speed of 106 frames per second. [Conclusions] The improved algorithm increases both the speed and the accuracy of pedestrian detection, and the resulting model is small, which makes it easy to deploy and to run in real time.
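To make the EIOU loss mentioned in the abstract concrete, the following is a minimal sketch of the commonly published EIoU formulation: one minus the IoU, plus penalties on the center distance, the width difference, and the height difference, each normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration. The abstract does not give the authors' implementation, so this should not be read as their code.

```python
def eiou_loss(box_a, box_b, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: box format and eps are assumptions.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes.
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Plain IoU.
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)

    # Squared distance between the two box centers,
    # normalized by the squared diagonal of the enclosing box.
    dx = (ax1 + ax2) / 2 - (bx1 + bx2) / 2
    dy = (ay1 + ay2) / 2 - (by1 + by2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (aw - bw) ** 2 / (cw * cw + eps)
    height_term = (ah - bh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    # A predicted box and a ground-truth box with made-up coordinates.
    print(eiou_loss((10, 10, 50, 90), (12, 8, 48, 95)))
```

Compared with a plain IoU loss, the extra terms still produce a useful signal when the boxes do not overlap, and they penalize width and height errors separately rather than through a combined aspect-ratio term.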

Key words: pedestrian detection, deep learning, YOLOv5, depthwise separable convolution, attention mechanism
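The parameter saving from the depthwise separable convolution named in the abstract and key words can be checked with simple arithmetic. The sketch below counts weights (ignoring biases) for a standard k×k convolution and for its depthwise separable counterpart, that is, a k×k depthwise convolution followed by a 1×1 pointwise convolution. The function names and the layer sizes in the usage example are hypothetical and are not taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def conv_macs(params, out_h, out_w):
    """Multiply-accumulates: every weight is applied once per output position."""
    return params * out_h * out_w


if __name__ == "__main__":
    # Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, 80 x 80 output map.
    c_in, c_out, k, out_h, out_w = 64, 128, 3, 80, 80
    std = standard_conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print("standard params:", std)
    print("depthwise separable params:", dws)
    print("param ratio:", dws / std)
    print("standard MACs:", conv_macs(std, out_h, out_w))
    print("depthwise separable MACs:", conv_macs(dws, out_h, out_w))
```

In general the ratio of the two counts is 1/c_out + 1/k², so for 3×3 kernels and a wide layer the separable version needs a little more than one ninth of the weights and multiply-accumulates.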