A Model Coupling Shuffle Attention and Deformable Convolution for Hardware Object Detection on Overhead Transmission Lines
-
Abstract
Objectives: Accurate identification of hardware components (fittings) is a foundational task for the safety inspection of overhead transmission lines. To address the variable spectral characteristics and complex shapes of hardware targets, this paper proposes a YOLOv8 model that couples shuffle attention and deformable convolution (YOLOv8-SADC) for detecting overhead transmission line hardware in high-resolution visible-light images captured by unmanned aerial vehicles (UAVs). Methods: Shuffle attention is used to capture the dependencies between channels and pixels, strengthening the model's ability to mine spectral-spatial features. A deformable convolutional network is used to capture the complex spatial morphology of the various hardware targets and thereby localize them precisely. An intersection over union (IoU)-based bounding box loss with a dynamic non-monotonic focusing mechanism (Wise-IoU) is adopted for network training to improve the model's ability to learn from samples. Results: High-resolution visible-light images of overhead transmission lines in Shantou were collected by UAVs, a fittings detection dataset was built from them, and the proposed model was compared with a range of advanced fittings detection models. (1) Comparative analysis was carried out on the CPLID, 2511, and self-built datasets. The CPLID dataset contains 600 UAV images of normal insulators. The 2511 dataset uses a blue mat as the background and contains 2511 images of four types of fittings photographed from multiple angles: insulators, bolt pins, anti-vibration hammers, and tension clamps. The self-built dataset covers six types of fittings (connecting fittings, plate fittings, fan-shaped fittings, triangular fittings, short connecting fittings, and long fittings), with 516 images and 1745 labels. 
Experimental results show that YOLOv8-SADC performs well on the CPLID and 2511 datasets and adapts fully to scenes with simple backgrounds and a single object category. (2) The comparative experiments show that the proposed model detects targets significantly better than existing mainstream algorithms, reaching a mean average precision (mAP) of 84.4% at an IoU threshold of 0.5 (mAP50), which indicates that the method is robust across fitting types. The mAP50 reached 90.52% for the connect category and 93.52% for the long category, indicating particularly good detection of rectangular fittings. Detection accuracy also improved for sector fittings, which are small and frequently occluded. (3) By background, the samples comprise 116 with blue sky, 646 with clouds, 495 with buildings, and 488 with natural environment. Blue-sky backgrounds cause little interference: the fittings stand out and key information is easy to extract. Natural-environment backgrounds consist mostly of trees and grassland, and detection accuracy decreases slightly. Cloud texture may introduce some background interference, reducing the model's feature extraction efficiency. The structural features of buildings (such as straight lines and edges) resemble the shapes of fittings, which can easily lead to false detections. 
The performance improvement of YOLOv8-SADC varies across backgrounds: natural-environment and building backgrounds are relatively complex, and detection accuracy is significantly higher against simple backgrounds (such as blue sky and clouds). Conclusions: Working from high-resolution visible-light images captured by UAVs, YOLOv8-SADC strengthens spectral-spatial feature extraction through coupled shuffle attention and introduces deformable convolution to capture the complex morphological features of hardware fittings. A Wise-IoU loss function based on a dynamic non-monotonic focusing mechanism is employed to optimize localization accuracy. Experiments on a UAV-acquired dataset from the coastline of Shantou, Guangdong show that the method significantly improves detection of the various fitting targets in complex scenes while preserving the completeness and detail of the recognition results, making it an efficient solution for fitting identification.
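For readers unfamiliar with the quantities the abstract relies on, the following is a minimal illustrative sketch, not the authors' implementation. It shows how the IoU between a predicted and a ground-truth box is computed, how the 0.5 threshold behind mAP50 decides whether a detection counts as a match, and what a non-monotonic focusing gain can look like. The function names, the `(x1, y1, x2, y2)` box format, and the gain form `r = beta / (delta * alpha ** (beta - delta))` with `alpha = 1.9` and `delta = 3.0` are assumptions made for illustration; the abstract does not state which Wise-IoU variant or hyperparameters were used.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at mAP50."""
    return iou(pred_box, gt_box) >= threshold


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Assumed non-monotonic gain applied to a box's loss.

    beta is the box's IoU loss divided by a running mean of the IoU loss
    (its "outlier degree"). The gain rises for moderately hard boxes and
    falls again for extreme outliers, so very poor boxes do not dominate
    the gradient. alpha and delta are illustrative values only.
    """
    return beta / (delta * alpha ** (beta - delta))


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (2, 0, 12, 10)          # shifted 2 units to the right
    print(iou(pred, gt))           # overlap 80 / union 120
    print(is_match_at_50(pred, gt))
    for beta in (0.5, 1.5, 5.0):   # gain peaks between the extremes
        print(beta, focusing_gain(beta))
```

In this sketch the gain is largest for boxes of intermediate quality and smaller for both easy boxes and extreme outliers, which is the behaviour the phrase "non-monotonic focusing" describes.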