Objectives The rapid development of unmanned aerial vehicle (UAV) remote sensing technology is giving UAVs an increasingly important role in emergency search and rescue operations and in law enforcement tracking. The distinctive advantages of UAVs include rapid response, wide-area coverage, and a panoramic view. Person detection in UAV imagery is therefore of great importance for emergency rescue and social security. However, because persons in UAV imagery are small and appear against complex backgrounds, high-precision person detection remains a significant challenge.
Methods To address this issue, a person detection network for UAV imagery is proposed. First, the network uses space-to-depth convolution in place of the downsampling layers in the feature extraction stage. Because this operation moves spatial information into the channel dimension instead of discarding it, it prevents information loss and preserves small-object features throughout feature extraction. Second, a selective feature fusion module is designed to address noise interference. This module uses high-level features as weights to guide the selection of crucial information in low-level features, thereby mitigating the impact of background noise on the low-level features. It also integrates the semantic information of high-level features more fully with the detailed information of low-level features, which significantly enhances the capacity of the network to distinguish foreground from background in UAV images. Finally, a context-aware module is designed to address the limited feature information available for small objects. This module integrates the environmental information surrounding an object, including local features, surrounding context, and global context, thus enriching the contextual features of small objects and improving the final detection accuracy.
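The space-to-depth rearrangement that replaces downsampling can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `space_to_depth`, the block size of 2, the channel ordering, and the use of plain nested lists are all assumptions made for illustration, and the convolution that would normally follow the rearrangement is omitted.

```python
def space_to_depth(feature, block=2):
    """Rearrange a [C][H][W] nested list into [C*block*block][H//block][W//block].

    Hypothetical helper for illustration only. Every input value is kept;
    spatial resolution is traded for extra channels instead of being
    discarded as in strided convolution or pooling.
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    out = []
    # Each (dy, dx) offset selects one sub-grid of the input; the sub-grids
    # are stacked as new channels (ordering here is an assumption).
    for dy in range(block):
        for dx in range(block):
            for c in range(channels):
                out.append([
                    [feature[c][y][x] for x in range(dx, width, block)]
                    for y in range(dy, height, block)
                ])
    return out


# A single-channel 4x4 feature map holding the distinct values 0..15.
fmap = [[[row * 4 + col for col in range(4)] for row in range(4)]]
packed = space_to_depth(fmap)  # 4 channels, each 2x2
```

In this toy case the 1×4×4 input becomes a 4×2×2 output that contains exactly the same 16 values, which is the property the method relies on to keep small-object detail while halving the spatial resolution.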
Results To verify the effectiveness of the proposed method, experiments are conducted on public datasets. The proposed method achieves a mean average precision (mAP) of 68.9%, with a precision of 75.5% and a recall of 67.7%. Compared with conventional object detection algorithms, the proposed method improves mAP by 3.1% to 29.5%, precision by 0.9% to 8.6%, and recall by 1.0% to 57.9%. These results indicate that the proposed method achieves high detection accuracy. In addition, its generalization and applicability are confirmed by testing on UAV images from various scenarios and under different weather conditions.
Conclusions The proposed method effectively addresses the challenge of detecting small persons in UAV images and significantly improves detection accuracy. It has broad application prospects in emergency rescue and social security, and it generalizes well to UAV image detection tasks in different environments.