LÜ Yafei, XIONG Wei, ZHANG Xiaohan. A General Cross-Modal Correlation Learning Method for Remote Sensing[J]. Geomatics and Information Science of Wuhan University, 2022, 47(11): 1887-1895. DOI: 10.13203/j.whugis20200213

A General Cross-Modal Correlation Learning Method for Remote Sensing

Funds: The National Natural Science Foundation of China (61790550, 61790554, 91538201)

  • Author Bio:

    LÜ Yafei, PhD, engineer, specializes in remote sensing image processing and cross-modal retrieval. E-mail: YFei_Lv@163.com

  • Corresponding author:

    XIONG Wei, PhD, professor. E-mail: xiongwei@csif.org.cn

  • Received Date: June 15, 2020
  • Available Online: November 15, 2022
  • Published Date: November 04, 2022
  •   Objectives  To address the inconsistent data distributions among cross-modal remote sensing information caused by the "heterogeneity gap", a new cross-modal remote sensing dataset is constructed and publicly released.
      Methods  To bridge the "heterogeneity gap", a general cross-modal correlation learning method (CCLM) is proposed for remote sensing. Building on the latent semantic consistency between different modalities, CCLM comprises two stages: learning feature representations and constructing a common feature space. First, deep neural networks extract feature representations of image and sequence information. To construct the common feature space, a new loss function is designed for correlation learning that exploits both the semantic consistency within each modality and the complementary information across modalities. Second, knowledge distillation is used to strengthen semantic relevance so that the common space is semantically consistent.
      Results  Experiments are performed on the constructed dataset. The results show that the mean average precision (mAP) of CCLM on cross-modal retrieval tasks exceeds 70%.
      Conclusions  CCLM outperforms the baseline methods, which verifies the effectiveness of the proposed dataset and method.
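
The abstract reports retrieval quality as mean average precision (mAP) but does not spell out the evaluation protocol. The following is a minimal sketch of one common way to compute mAP for cross-modal retrieval, not the authors' evaluation code. It assumes that both modalities have already been embedded in the common feature space, that items are ranked by cosine similarity, and that a gallery item counts as relevant when it shares the query's class label. The function names and the toy data are hypothetical.

```python
import numpy as np


def average_precision(relevant):
    """Average precision for one query, given a ranked 0/1 relevance list."""
    relevant = np.asarray(relevant, dtype=float)
    n_relevant = relevant.sum()
    if n_relevant == 0:
        return 0.0  # no relevant item in the gallery for this query
    hits = np.cumsum(relevant)                  # relevant items seen up to rank k
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = hits / ranks
    # Average the precision values at the ranks where a relevant item occurs.
    return float((precision_at_k * relevant).sum() / n_relevant)


def mean_average_precision(query_feats, query_labels, gallery_feats, gallery_labels):
    """mAP over all queries; ranking by cosine similarity is an assumption."""
    q = np.asarray(query_feats, dtype=float)
    g = np.asarray(gallery_feats, dtype=float)
    query_labels = np.asarray(query_labels)
    gallery_labels = np.asarray(gallery_labels)
    # L2-normalise so that the dot product equals cosine similarity.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sims = q @ g.T
    aps = []
    for i in range(len(q)):
        order = np.argsort(-sims[i])            # most similar gallery item first
        aps.append(average_precision(gallery_labels[order] == query_labels[i]))
    return float(np.mean(aps))


# Toy data: two queries from one modality, three gallery items from the other.
toy_query_feats = [[1.0, 0.0], [0.0, 1.0]]
toy_query_labels = [0, 1]
toy_gallery_feats = [[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]]
toy_gallery_labels = [1, 1, 0]

score = mean_average_precision(
    toy_query_feats, toy_query_labels, toy_gallery_feats, toy_gallery_labels
)
print(f"mAP = {score:.4f}")
```

In this toy case the first query finds its only relevant item at rank 2 (AP 0.5) and the second finds its two relevant items at ranks 1 and 3 (AP about 0.83), so the mean is about 0.67. Whether the reported figure is computed over the full ranking or a truncated list is not stated in the abstract.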