Abstract:
Objectives: The rapid development of UAV remote sensing technology is giving UAVs a growing role in emergency search and rescue operations and law enforcement tracking, owing to their rapid response, wide-area coverage, and panoramic view. Person detection in UAV imagery is therefore of great importance for emergency rescue and social security. However, because person objects in UAV imagery are small and the backgrounds are complex, high-precision person detection remains a significant challenge.
Methods: To address this issue, a UAV image person detection network is proposed. Firstly, the network employs a spatial depth-transformed convolution in lieu of a downsampling layer in the feature extraction stage. This retains all information in the channel dimension, preventing information loss and preserving small object features throughout the feature extraction process. Secondly, a selective feature fusion module is designed to address the issue of noise interference. The module employs high-level features as weights to guide the selection of crucial information in low-level features, thereby mitigating the impact of background noise on the latter. It also integrates the semantic information of high-level features more fully with the detailed information of low-level features, which significantly enhances the network's capacity to distinguish between background and foreground information in UAV images. Finally, a context-aware module is designed to address the issue of limited feature information associated with small objects. The module integrates environmental information surrounding the object, including local features, surrounding context, and global context, thus enhancing the contextual features of small objects and improving the final object detection accuracy.
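As an illustration of why replacing a downsampling layer in this way need not discard information, the following is a minimal sketch of a space-to-depth rearrangement, which is assumed here to be the first step of the spatial depth-transformed convolution (the abstract does not give the exact layer definition). The function name `space_to_depth`, the block size of 2, and the channel ordering are illustrative assumptions, and the convolution that would follow the rearrangement is omitted.

```python
def space_to_depth(feature_map, block=2):
    """Rearrange a C x H x W nested list into (C*block*block) x (H/block) x (W/block).

    Every input value is moved into a channel instead of being dropped,
    so spatial resolution is reduced without losing information.
    Hypothetical helper for illustration, not the authors' implementation.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    out = []
    # Each (dy, dx) offset inside a block x block cell yields one
    # sub-sampled copy of every input channel.
    for dy in range(block):
        for dx in range(block):
            for c in range(channels):
                out.append([
                    [feature_map[c][y][x] for x in range(dx, width, block)]
                    for y in range(dy, height, block)
                ])
    return out


# One 4 x 4 channel holding the values 0..15.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]

out = space_to_depth(feature_map)

# The result has 4 channels of size 2 x 2 and still contains all 16 values.
flattened = sorted(v for channel in out for row in channel for v in row)
print(len(out), len(out[0]), len(out[0][0]))
print(flattened == list(range(16)))
```

In contrast, a stride-2 convolution or pooling layer would merge or skip three of every four spatial positions, which is the kind of loss that matters most when a person occupies only a few pixels.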
Results: To verify the effectiveness of the proposed method, experiments are conducted on public datasets. The method achieved a mean Average Precision of 68.9%, with precision at 75.5% and recall at 67.7%. Compared to conventional object detection algorithms, the method showed improvements ranging from 3.1% to 29.5% in mean Average Precision, 0.9% to 8.6% in precision, and 1.0% to 57.9% in recall. These results indicate that the method exhibits a high level of accuracy. Additionally, the generalisation and applicability of the method are confirmed through testing on UAV images in various scenarios and under different weather conditions.
Conclusions: The method effectively addresses the challenge of detecting small persons in UAV images and significantly improves detection accuracy. It has broad application prospects in emergency rescue and social security, and generalises well to UAV image detection tasks in different environments.