Citation: LÜ Yafei, XIONG Wei, ZHANG Xiaohan. A General Cross-Modal Correlation Learning Method for Remote Sensing[J]. Geomatics and Information Science of Wuhan University, 2022, 47(11): 1887-1895. DOI: 10.13203/j.whugis20200213


A General Cross-Modal Correlation Learning Method for Remote Sensing

    Abstract:
      Objectives  The "heterogeneity gap" between different modalities of remote sensing information leads to inconsistent data distributions, making cross-modal similarity difficult to measure. To address this problem, a new cross-modal remote sensing dataset containing four modalities is constructed and publicly released.
      Methods  Based on the latent semantic consistency among different modalities, a general cross-modal correlation learning method (CCLM) is proposed for remote sensing. CCLM consists of two stages: feature representation learning and common feature space construction. First, exploiting the representational power of deep neural networks, dedicated feature learning networks are designed for image-type and sequence-type information, so that the high-level semantics of each modality are accurately represented. To construct the common feature space, a new loss function is designed for correlation learning, which constrains the semantic consistency within each modality and exploits the complementary information across modalities. Second, following the idea of knowledge distillation, the information of all modalities is first fused and then transferred back to each modality, enhancing inter-modal semantic relevance and achieving semantic consistency in the common space.
      Results  Experiments are performed on the constructed dataset. The results show that the mean average precision (mAP) of CCLM on cross-modal retrieval tasks reaches 70%, exceeding the baseline methods.
      Conclusions  CCLM outperforms the baseline methods, which verifies the effectiveness of the proposed dataset and method.
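
The Methods paragraph above describes a concrete training recipe: modality-specific encoders project image-type and sequence-type inputs into a common space, a correlation loss couples intra-modal semantic consistency with inter-modal complementarity, and a fuse-then-transfer distillation step aligns the modalities. Below is a minimal PyTorch sketch of that recipe; the encoder architectures, the exact loss forms, and the averaging-based fusion (ImageEncoder, SequenceEncoder, correlation_loss, distillation_loss are all illustrative names) are assumptions, not the authors' published implementation.

# Hypothetical sketch only: architectures, loss forms, and the fusion scheme
# are assumptions inferred from the abstract, not the published CCLM code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageEncoder(nn.Module):
    """Projects image-type features (e.g., pooled CNN activations) into the common space."""
    def __init__(self, in_dim=2048, common_dim=512):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU(),
                                nn.Linear(1024, common_dim))

    def forward(self, x):
        return F.normalize(self.fc(x), dim=-1)

class SequenceEncoder(nn.Module):
    """Projects sequence-type inputs (e.g., token ids) into the common space."""
    def __init__(self, vocab=10000, embed_dim=300, common_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed_dim)
        self.gru = nn.GRU(embed_dim, common_dim, batch_first=True)

    def forward(self, tokens):
        _, h = self.gru(self.embed(tokens))  # h: (1, batch, common_dim)
        return F.normalize(h.squeeze(0), dim=-1)

def correlation_loss(img_f, seq_f, labels, classifier):
    # Intra-modal semantic consistency: the common-space features of each
    # modality must predict the shared semantic label via one shared classifier.
    intra = (F.cross_entropy(classifier(img_f), labels)
             + F.cross_entropy(classifier(seq_f), labels))
    # Inter-modal complementarity: matched image/sequence pairs should be
    # close in the common space (one of many possible pairwise terms).
    inter = (1.0 - F.cosine_similarity(img_f, seq_f)).mean()
    return intra + inter

def distillation_loss(img_f, seq_f):
    # Fuse-then-transfer: average the modality features into a "teacher"
    # representation, then distill it back to each modality.
    teacher = F.normalize((img_f + seq_f) / 2, dim=-1).detach()
    return F.mse_loss(img_f, teacher) + F.mse_loss(seq_f, teacher)

if __name__ == "__main__":
    img = torch.randn(8, 2048)              # e.g., pooled CNN features
    seq = torch.randint(0, 10000, (8, 20))  # token ids of the paired sequence
    labels = torch.randint(0, 5, (8,))      # shared semantic labels
    img_enc, seq_enc = ImageEncoder(), SequenceEncoder()
    classifier = nn.Linear(512, 5)          # shared common-space classifier
    f_i, f_s = img_enc(img), seq_enc(seq)
    loss = correlation_loss(f_i, f_s, labels, classifier) + distillation_loss(f_i, f_s)
    loss.backward()

At retrieval time, a query from one modality would be encoded into the common space and ranked against the other modality's features by cosine similarity; mAP over such rankings is the metric reported in the Results above.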

     
