Training Remote Sensing Image Object Detection Models Using Synthetic Data

罗诗琦; 罗斌; 苏鑫; 张婧; 刘军

doi:10.13203/j.whugis20230149

Leveraging Synthetic Data for Object Detection in Remote Sensing Images

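Abstract

Abstract: Object detection in remote sensing images is widely used, but acquiring training data is constrained by weather, policy, and other factors, and annotation is costly. Synthetic data rendered by computer, by contrast, can be generated quickly and at low cost. This paper therefore proposes applying synthetic data to the training of remote sensing image object detection models. First, a synthetic data collection system based on GTA5 is proposed that automatically acquires images and their annotations, allowing a large-scale synthetic sample set for remote sensing image object detection to be built quickly. Then, a cycle-consistent generative adversarial network is constructed to transfer the style of the synthetic remote sensing images to that of real remote sensing images. Finally, the effectiveness of the method is evaluated on the real datasets UCAS-AOD and NWPU VHR-10. The results show that style transfer reduces the domain gap between synthetic and real data, and that pre-training on the transferred synthetic sample set improves detection accuracy, with maximum average precision gains of 27.9% and 18.5% on Faster RCNN and YOLOv8, respectively. This demonstrates that the method reduces the cost of building training sets for remote sensing image object detection and can offer a valuable solution for scenarios where real annotated data are insufficient.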

Abstract:

Objectives: Object detection in remote sensing images is widely used, but labeling a training set is costly and image acquisition is limited by weather conditions and policies. Synthetic data produced by computer rendering, in contrast, can be generated quickly and at low cost. We therefore propose a method for object detection in remote sensing images that utilizes synthetic data.

Methods: First, we develop a synthetic data collection system based on GTA5 (Grand Theft Auto V) that automatically obtains images and their annotations. Using this system, we construct a large-scale synthetic dataset for remote sensing image object detection, named GTA5-Vehicle, which includes 29,657 vehicle instances. Second, we construct a cycle-consistent adversarial network to transfer the synthetic images to the style of real images while preserving their content. Finally, we evaluate the effectiveness of the proposed approach on two real datasets, UCAS-AOD and NWPU VHR-10. To validate the generalization capability of the approach, both Faster RCNN and YOLOv8 models are used in the object detection experiments.

Results: The results demonstrate that style transfer reduces the domain discrepancy between synthetic and real data, and that pre-training on the transferred synthetic dataset improves detection accuracy. Specifically, in the absence of real annotated data, applying style transfer yields an average precision improvement of 8.7% for the Faster RCNN model across both real datasets. Training the YOLOv8 model on the transferred synthetic dataset gives average precision scores of 80.9% and 66.5% on the respective datasets with IoU set to 0.5. Moreover, when a limited amount of real annotated data is available, pre-training on the transferred synthetic dataset improves detection accuracy, with the highest average precision improvements reaching 27.9% and 18.5% for Faster RCNN and YOLOv8, respectively.

Conclusions: The proposed method not only reduces the cost of constructing a training set for object detection in remote sensing images but also offers a valuable solution for settings with limited real annotated data. The synthetic data and code are available at: https://lsq210.github.io/GTAVDataCollection/ .
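
The average precision figures above are reported with IoU set to 0.5, meaning a predicted box counts as a correct detection only if its intersection-over-union with a ground-truth box reaches that threshold. The following is a minimal sketch of that criterion in plain Python, not the authors' evaluation code: the function names `iou` and `count_true_positives`, the `(x_min, y_min, x_max, y_max)` box layout, and the simplified greedy matching are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_true_positives(
    preds: List[Box], truths: List[Box], threshold: float = 0.5
) -> int:
    """Count predictions matched to a ground-truth box at the given IoU.

    Hypothetical helper: `preds` is assumed to be sorted by descending
    confidence, and each ground-truth box can be matched at most once.
    """
    matched = set()
    true_positives = 0
    for pred in preds:
        best_index, best_iou = -1, 0.0
        for index, truth in enumerate(truths):
            if index in matched:
                continue  # this ground-truth box is already claimed
            overlap = iou(pred, truth)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        if best_index >= 0 and best_iou >= threshold:
            matched.add(best_index)
            true_positives += 1
    return true_positives
```

Precision and recall at each confidence level follow from such true-positive counts, and average precision summarizes the resulting precision-recall curve; standard evaluation toolkits differ in the details of matching and interpolation, so this sketch should not be expected to reproduce the reported numbers.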
