Journal of Jinggangshan University (Natural Science)

Article Abstract

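LIU Kangji, SHI Peicheng, QI Heng, YANG Aixi. FCOS object detection algorithm based on multi-branch atrous convolution and adaptive feature fusion[J]. Journal of Jinggangshan University (Natural Science), 2023, 44(6): 75-83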

FCOS Object Detection Algorithm Based on Multi-Branch Atrous Convolution and Adaptive Feature Fusion

Received: 2023-02-11 Revised: 2023-04-20

DOI: 10.3969/j.issn.1674-8085.2023.06.010

Keywords: object detection; anchor-free method; backbone network; atrous convolution; feature fusion

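Funding: National Natural Science Foundation of China, General Program (51575001); Anhui Provincial Key Research and Development Program (202104a05020003); Anhui Provincial Natural Science Foundation (2208085MF173)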

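Author	Affiliation
LIU Kangji	School of Mechanical Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000
SHI Peicheng	School of Mechanical Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000
QI Heng	School of Mechanical Engineering, Anhui Polytechnic University, Wuhu, Anhui 241000
YANG Aixi	Polytechnic Institute, Zhejiang University, Hangzhou, Zhejiang 310000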

Abstract views: 596

Full-text downloads: 1027

Abstract:

Deep-learning-based object detection is widely used in autonomous driving and robot vision. For this task, FCOS (fully convolutional one-stage object detection) performs per-pixel object detection with a fully convolutional, anchor-free design, but the original FCOS still suffers from insufficient image feature extraction, inadequate capture of global feature information, and unsatisfactory feature fusion. This paper therefore improves FCOS and applies it to multi-object detection in images. First, ResNeSt50 replaces the original ResNet50 backbone, combining feature-map attention with multi-path representation to strengthen the backbone's feature extraction. Then, a Receptive Field Enhancement Module (RFEM) is built from multi-branch atrous (dilated) convolutions to obtain more comprehensive global context information. Finally, building on the original FCOS feature fusion, an Adaptive Recombination Feature Fusion Module (ARFFM) is designed to efficiently fuse the semantic information of high-level feature maps with the detail information of low-level feature maps. Experiments on the PASCAL VOC2007 dataset show that the improved FCOS achieves a mean average precision (mAP) of 81.2%, 2.9% higher than the original FCOS algorithm, and delivers advanced performance on most classes. Extensive ablation experiments show that the ResNeSt50, RFEM, and ARFFM modules bring gains of 1.2%, 2.1%, and 2.9% over the baseline network, respectively. These improvements offer a new solution for detecting small objects and occluded objects.
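The abstract does not give the internal structure of RFEM, so the following is only a minimal sketch of the general idea behind multi-branch atrous convolution, not the authors' module. It works in 1D with plain Python lists. The dilation rates (1, 3, 5), the averaging of branch outputs, and all function names are illustrative assumptions. The one fact it relies on is that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, so parallel branches with different rates see different amounts of context at the same parameter cost.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding.

    The kernel must have odd length so it can be centred on each position.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    half = (len(kernel) // 2) * dilation  # reach of the kernel to each side
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # positions outside count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def multi_branch_dilated(signal, kernel, dilations=(1, 3, 5)):
    """Run one branch per dilation rate and average the branch outputs.

    Averaging is an assumed, simplest-possible merge; the paper's RFEM
    may combine its branches differently.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(branches) for values in zip(*branches)]


# Hypothetical input: a unit impulse in the middle of an 11-sample signal.
impulse = [0.0] * 11
impulse[5] = 1.0
response = multi_branch_dilated(impulse, [1.0, 1.0, 1.0])
```

With a 3-tap kernel, the three assumed branches span 3, 7, and 11 input positions respectively, so the merged response to the impulse reaches both ends of the 11-sample signal while each branch still uses only three weights.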
