Abstract:
High-resolution remote sensing images have complicated content and abundant detail. Because such images are difficult to describe with traditional shallow features, a large semantic gap arises between those features and the image content. This paper proposes a method that uses four different CNNs pre-trained on ImageNet for remote sensing image retrieval. High-level features are extracted from different layers of the four CNNs. A Gaussian normalization method is adopted to normalize the high-level features, and Euclidean distance is used as the similarity measure. A series of experiments carried out on the UC-Merced and WHU-RS datasets shows that the CNN-M feature achieves the best retrieval performance among the CNN features. Compared with the visual bag of words and global morphological texture descriptors, the mean average precision of the CNN features is 15.7%-25.6% higher than that of the shallow features, and the average normalized modified retrieval rank of the CNN features is 17%-22.1% lower. The pre-trained convolutional neural network is therefore effective for high-resolution remote sensing image retrieval.
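The normalization and ranking steps named in the abstract can be illustrated with a minimal sketch. It assumes the 3-sigma form of Gaussian normalization that is common in the image retrieval literature (the abstract does not state the exact formula), and it uses small hand-made vectors in place of real CNN features. The function names `gaussian_normalize` and `rank_by_euclidean` are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def gaussian_normalize(features):
    """Normalize each feature dimension with the 3-sigma rule (assumed form)."""
    dims = list(zip(*features))  # column-wise view of the feature matrix
    mus = [mean(col) for col in dims]
    sigmas = [pstdev(col) or 1.0 for col in dims]  # guard constant dimensions
    normalized = []
    for vec in features:
        row = []
        for x, mu, sigma in zip(vec, mus, sigmas):
            z = (x - mu) / (3.0 * sigma)   # most values fall inside [-1, 1]
            z = max(-1.0, min(1.0, z))     # clip the remaining outliers
            row.append((z + 1.0) / 2.0)    # shift the range to [0, 1]
        normalized.append(row)
    return normalized


def rank_by_euclidean(query_index, features):
    """Return the other image indices sorted by Euclidean distance to the query."""
    query = features[query_index]
    distances = [
        (math.dist(query, vec), i)
        for i, vec in enumerate(features)
        if i != query_index
    ]
    return [i for _, i in sorted(distances)]


# Toy stand-ins for CNN features (hypothetical values, not from the paper).
toy_features = [
    [0.9, 0.1, 4.0],
    [1.0, 0.2, 4.2],
    [5.0, 3.0, 0.5],
    [5.2, 2.9, 0.4],
]
normalized = gaussian_normalize(toy_features)
ranking = rank_by_euclidean(0, normalized)
```

Normalizing each dimension before computing distances keeps dimensions with large numeric ranges from dominating the Euclidean distance, which is presumably why the two steps are paired in the proposed method.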