A Model Coupling Shuffle Attention and Deformable Convolution for Overhead Transmission Line Hardware Object Detection

  • Abstract: Accurate identification of hardware fittings is a key foundational task in the safety inspection of overhead transmission lines. To address the variable spectral appearance and complex shapes of hardware targets, this paper proposes a YOLOv8 model that couples shuffle attention and deformable convolution for hardware object detection in high-resolution visible-light images of overhead transmission lines acquired by unmanned aerial vehicles (UAVs). Shuffle attention captures the dependencies between channels and pixels, strengthening the model's spatial-spectral feature mining ability; a deformable convolutional network is coupled in to finely mine the complex spatial morphological features of various hardware targets and achieve high-precision localization; and a bounding box loss with a dynamic non-monotonic focusing mechanism is used for network training to improve the model's sample learning ability. High-resolution visible-light images of overhead transmission lines in Shantou, collected in the field by UAVs, serve as the data source; a hardware target dataset is built for the experiments, and the proposed model is compared with several state-of-the-art hardware detection models. The experimental results show that the proposed model clearly outperforms existing mainstream methods: at an intersection-over-union threshold of 0.5, the mean average precision reaches 84.4%, indicating that the model can identify all kinds of hardware targets quickly and with high accuracy, supporting the safe maintenance of transmission infrastructure.

     

    Abstract: Objectives: The accurate identification of hardware fittings is a crucial foundational task for the safety inspection of overhead transmission lines. This paper proposes a novel YOLOv8 model coupling shuffle attention and deformable convolution (YOLOv8-SADC) to address the variable spectral characteristics and complex shapes of hardware targets, and applies it to hardware object detection in high-resolution visible-light images of overhead transmission lines acquired by unmanned aerial vehicles (UAVs). Methods: A shuffle attention module is constructed to capture the dependencies between channels and pixels, enhancing the spatial-spectral feature mining ability of the model. A deformable convolutional network is used to finely mine the complex spatial morphological features of various hardware targets and achieve high-precision localization. A bounding box loss based on intersection over union (IoU) with a dynamic non-monotonic focusing mechanism (Wise-IoU) is adopted for network training to improve the sample learning ability of the model (minimal code sketches of these components are given after the abstract). Results: High-resolution visible-light images of overhead transmission lines in Shantou were collected by UAVs as the data source, a hardware target dataset was constructed for the experiments, and the proposed model was compared with a variety of advanced hardware detection models. (1) Comparative analysis was carried out on the CPLID, 2511 and self-built datasets. The CPLID dataset includes 600 images of normal insulators taken by UAVs; the 2511 dataset uses a blue mat as the background and contains 2 511 images of four types of fittings photographed from multiple angles: insulators, bolt pins, anti-vibration hammers, and tension clamps. The self-built dataset covers six types of fittings, namely connecting fittings, plate fittings, fan-shaped fittings, triangular fittings, short connecting fittings and long fittings, with a total of 516 images and 1 745 labels. The experimental results show that YOLOv8-SADC performs excellently on the CPLID and 2511 datasets and fully adapts to scenarios with simple backgrounds and a single object category. (2) The comparative experiments show that the detection performance of the proposed model is significantly better than that of existing mainstream algorithms: the mean average precision (mAP) reaches 84.4% when the IoU threshold is 0.5, indicating that the method is robust to, and effectively detects, all kinds of fittings. In particular, the mAP50 of the connecting-fitting category reaches 90.52% and that of the long-fitting category reaches 93.52%, showing that the method detects all kinds of fittings well, especially fittings with rectangular shapes. For the fan-shaped fittings, which are small targets and frequently occluded, the detection accuracy is also improved. (3) The dataset contains 116 samples with a blue-sky background, 646 with a cloud background, 495 with buildings as the background, and 488 with a natural environment as the background. The blue-sky background has little interference, so the fitting targets stand out and key information is easy to extract. The natural environment background consists mostly of trees and grassland, where the detection accuracy decreases slightly. Cloud texture in the cloudy background may introduce some background interference, reducing the feature extraction efficiency of the model.
The structural characteristics of buildings (such as straight lines and edges) are somewhat similar to the shapes of the fittings, which easily leads to false detections. The performance gain of YOLOv8-SADC therefore varies with the background: the natural-environment and building backgrounds are relatively complex, while the detection accuracy is significantly higher against simple backgrounds (such as blue sky and clouds). Conclusions: Based on high-resolution visible-light images captured by UAVs, YOLOv8-SADC enhances the extraction of spatial-spectral features through coupled shuffle attention and introduces deformable convolution to mine the complex morphological features of hardware fittings in greater depth. In addition, a Wise-IoU loss function based on a dynamic non-monotonic focusing mechanism is employed to optimize localization accuracy. Experiments on a UAV-acquired dataset along the coastline of Shantou, Guangdong demonstrate that the method significantly improves the detection of various hardware fittings in complex scenes and effectively preserves the integrity and detail of the recognition results, providing an efficient solution for hardware fitting identification.
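For concreteness, the following is a minimal PyTorch sketch of how a shuffle-attention block can be coupled with a deformable convolution in the sense described above. It is an illustrative reconstruction, not the authors' released code: the module layout, the group count, and the offset-prediction branch are assumptions made for the example, and torchvision's DeformConv2d stands in for the deformable convolutional network used in the paper.

```python
# Illustrative sketch only: layout and hyper-parameters are assumptions,
# not the paper's implementation.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so information flows between them."""
    n, c, h, w = x.shape
    x = x.reshape(n, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.reshape(n, c, h, w)


class ShuffleAttention(nn.Module):
    """Grouped channel + spatial (pixel) attention with a final channel shuffle."""

    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        self.groups = groups
        half = channels // (2 * groups)
        # learnable scale/shift for the channel branch (after global pooling)
        self.c_weight = nn.Parameter(torch.zeros(1, half, 1, 1))
        self.c_bias = nn.Parameter(torch.ones(1, half, 1, 1))
        # learnable scale/shift for the spatial branch (after group norm)
        self.s_weight = nn.Parameter(torch.zeros(1, half, 1, 1))
        self.s_bias = nn.Parameter(torch.ones(1, half, 1, 1))
        self.gn = nn.GroupNorm(half, half)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x = x.reshape(n * self.groups, c // self.groups, h, w)
        xc, xs = x.chunk(2, dim=1)  # channel branch / spatial branch
        # channel attention: global average pooling -> affine -> sigmoid gate
        gate_c = self.sigmoid(xc.mean((2, 3), keepdim=True) * self.c_weight + self.c_bias)
        xc = xc * gate_c
        # spatial attention: group norm -> affine -> sigmoid gate over pixels
        gate_s = self.sigmoid(self.gn(xs) * self.s_weight + self.s_bias)
        xs = xs * gate_s
        out = torch.cat([xc, xs], dim=1).reshape(n, c, h, w)
        return channel_shuffle(out, 2)


class DeformableBlock(nn.Module):
    """3x3 deformable convolution whose sampling offsets are predicted from the input."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # 2 offsets (dx, dy) per kernel position of a 3x3 kernel -> 18 channels
        self.offset = nn.Conv2d(in_ch, 2 * 3 * 3, kernel_size=3, padding=1)
        self.dcn = DeformConv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.dcn(x, self.offset(x)))


# usage: attention refines a feature map, then the deformable block resamples it
feat = torch.randn(2, 64, 40, 40)
block = nn.Sequential(ShuffleAttention(64, groups=8), DeformableBlock(64, 64))
print(block(feat).shape)  # torch.Size([2, 64, 40, 40])
```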
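Similarly, the bounding-box loss with a dynamic non-monotonic focusing mechanism (Wise-IoU) can be sketched as below. The hyper-parameters (alpha, delta, momentum) and the running-mean bookkeeping for the outlier degree are assumptions for illustration and may differ from the training configuration actually used in the paper.

```python
# Sketch of a Wise-IoU-style loss; hyper-parameters are illustrative assumptions.
import torch
import torch.nn as nn


class WiseIoULoss(nn.Module):
    def __init__(self, alpha: float = 1.9, delta: float = 3.0, momentum: float = 0.01):
        super().__init__()
        self.alpha, self.delta, self.momentum = alpha, delta, momentum
        # running mean of the plain IoU loss, used to normalise the outlier degree
        self.register_buffer("iou_mean", torch.tensor(1.0))

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # boxes are (x1, y1, x2, y2); pred and target have shape (N, 4)
        x1 = torch.maximum(pred[:, 0], target[:, 0])
        y1 = torch.maximum(pred[:, 1], target[:, 1])
        x2 = torch.minimum(pred[:, 2], target[:, 2])
        y2 = torch.minimum(pred[:, 3], target[:, 3])
        inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
        area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
        area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
        iou = inter / (area_p + area_t - inter + 1e-7)
        loss_iou = 1.0 - iou

        # distance attention: centre distance over the (detached) enclosing-box diagonal
        cw = torch.maximum(pred[:, 2], target[:, 2]) - torch.minimum(pred[:, 0], target[:, 0])
        ch = torch.maximum(pred[:, 3], target[:, 3]) - torch.minimum(pred[:, 1], target[:, 1])
        d2 = ((pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) ** 2
              + (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) ** 2) / 4.0
        r_wiou = torch.exp(d2 / (cw ** 2 + ch ** 2 + 1e-7).detach())

        # dynamic non-monotonic focusing: down-weight both very easy and very hard boxes
        if self.training:
            self.iou_mean.mul_(1 - self.momentum).add_(self.momentum * loss_iou.detach().mean())
        beta = loss_iou.detach() / self.iou_mean
        focus = beta / (self.delta * self.alpha ** (beta - self.delta))
        return (focus * r_wiou * loss_iou).mean()


# usage on a single predicted/ground-truth box pair
criterion = WiseIoULoss()
pred = torch.tensor([[10.0, 10.0, 50.0, 60.0]])
gt = torch.tensor([[12.0, 8.0, 48.0, 62.0]])
print(criterion(pred, gt))
```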

     
