Abstract:
Objective Decryption (declassification, i.e., the removal of sensitive content) is key to the safe sharing of remote sensing resources. Traditional methods for hiding sensitive targets in remote sensing images suffer from incomplete target detection, unreliable completion (inpainting) results, high resource consumption, and difficult training. To address these problems, an automatic method for hiding sensitive targets in remote sensing images is proposed, based on the ability of the Transformer structure to handle global information.
Methods Firstly, an optimized Cascade Mask R-CNN instance segmentation model with Swin Transformer as the backbone network is used to detect sensitive targets and generate mask regions. To improve the generalization capability of the model, RSMosaic (remote sense Mosaic), a data synthesis method that reduces the dependence on manually labeled data, is designed. Secondly, the mask region is expanded using a shadow detection model based on HSV (hue-saturation-value) space, and the MAE (masked autoencoders) model is introduced to generate the background behind the target. Finally, the generated images are spliced with the original images to obtain the decrypted (declassified) images.
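The mask expansion and splicing steps can be illustrated with a minimal sketch. It is not the paper's shadow detection model: it assumes that a simple threshold on the V channel of HSV space stands in for shadow detection, and that only dark pixels within a few pixels of the target are added to the mask. The function names and the parameters `v_max` and `reach` are hypothetical, and the "generated" image is assumed to come from an inpainting model such as MAE.

```python
import numpy as np

def hsv_value(img):
    """V channel of HSV for an RGB uint8 image, scaled to [0, 1]."""
    return img.max(axis=2).astype(np.float64) / 255.0

def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 square, implemented with array shifts."""
    out = mask.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        out = grown
    return out

def expand_mask_with_shadow(img, target_mask, v_max=0.25, reach=3):
    """Add dark (assumed shadow) pixels near the target to the mask.

    v_max and reach are illustrative values, not taken from the paper.
    """
    shadow = hsv_value(img) < v_max          # assumed shadow criterion
    near_target = dilate(target_mask, reach)  # neighbourhood of the target
    return target_mask.astype(bool) | (shadow & near_target)

def splice(original, generated, mask):
    """Take generated pixels inside the mask and original pixels elsewhere."""
    return np.where(mask[..., None], generated, original)
```

In this sketch the splice keeps every pixel outside the mask identical to the original image, so only the hidden region depends on the generative model.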
Results Sub-meter remote sensing images collected from Google Earth are used as test data. The results show that the proposed method generates reliable hiding results while reducing dataset dependence and training resource consumption. Compared with the traditional method, the AP (average precision) values of the bounding box and the pixel mask are improved by 13.2% and 11.2%, respectively, in sensitive target instance segmentation, and they can be improved by a further 9.39% and 14.16%, respectively, after using RSMosaic. In image inpainting, the method outperforms other inpainting models in terms of objective indexes and their variance; in particular, the mean absolute error and maximum mean discrepancy indexes are improved by more than 80%. The method hides sensitive targets automatically, with reasonable structure and clear texture.
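For readers unfamiliar with the two inpainting indexes named above, the following sketch shows one common way to compute them. It assumes a biased estimate of the squared maximum mean discrepancy with a Gaussian (RBF) kernel over rows of feature vectors; the kernel, the bandwidth `sigma`, and the choice of features are assumptions, since the abstract does not specify how the indexes were computed.

```python
import numpy as np

def mean_absolute_error(a, b):
    """Mean absolute pixel difference between two arrays of equal shape."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def rbf_kernel(x, y, sigma):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    d2 = (np.sum(x ** 2, axis=1)[:, None]
          + np.sum(y ** 2, axis=1)[None, :]
          - 2.0 * x @ y.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets x and y.

    x and y are 2-D arrays with one sample (feature vector) per row.
    """
    k_xx = rbf_kernel(x, x, sigma).mean()
    k_yy = rbf_kernel(y, y, sigma).mean()
    k_xy = rbf_kernel(x, y, sigma).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)
```

Both quantities are lower-is-better: identical inputs give a mean absolute error of zero and a squared MMD of zero (up to floating-point error).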
Conclusions The proposed method reduces the manpower, data, and computing resources required and achieves better results in both subjective visual quality and objective indexes, providing technical support for sharing real remote sensing images.