Citation: Li Linhui, Wang Yu, Liu Yueyan, Li Lei, Huang Jincheng, Zhou Yi, Cao Songlin. A Fast Fusion Model for Multi-Source Heterogeneous Data of Real Estate Based on Feature Similarity[J]. Geomatics and Information Science of Wuhan University. DOI: 10.13203/j.whugis20220742

A Fast Fusion Model for Multi-Source Heterogeneous Data of Real Estate Based on Feature Similarity

Abstract: Objectives: The fast fusion of multi-source heterogeneous data is a major difficulty in data integration and database construction for unified real estate registration. Existing fusion methods for multi-source heterogeneous real estate data cannot accurately construct the association relationships between spatial entities, and their time cost is too high for large-scale real estate data. Methods: By introducing feature similarity, this paper proposes a multi-level fast fusion model for the batch fusion of multi-source heterogeneous real estate data. First, after the multi-source heterogeneous data are digitized and vectorized, similarity factors are introduced to evaluate them. Then, a network model based on a comprehensive similarity weighting algorithm is designed to calculate the comprehensive similarity of the data along the distribution direction of each similarity factor. Finally, a similarity threshold parameter and a limited-range parameter are used to further improve the accuracy and efficiency of the fusion. Results: The fast fusion model is analyzed quantitatively using real estate data as an example. The experimental results show that, compared with other methods, the proposed model has a low time cost: the average fusion time per thousand buildings is 3.29 s and the fusion accuracy reaches 93.5%, so the model can effectively improve both the accuracy and the efficiency of fusing multi-source heterogeneous real estate data.
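The pipeline outlined in the abstract (per-factor similarity, weighted combination, then a similarity threshold and a limited search range) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the similarity factors, the weights, or the parameter values, so the `Record` fields, the three factors (name, area, distance), and every function name below are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass
class Record:
    """A hypothetical real estate record; the fields are illustrative only."""
    rid: str
    name: str
    x: float
    y: float
    area: float


def name_similarity(a: Record, b: Record) -> float:
    # String similarity in [0, 1] from the standard library.
    return SequenceMatcher(None, a.name, b.name).ratio()


def area_similarity(a: Record, b: Record) -> float:
    # Ratio of the smaller area to the larger one, in [0, 1].
    largest = max(a.area, b.area)
    return 1.0 if largest == 0 else min(a.area, b.area) / largest


def distance_similarity(a: Record, b: Record, radius: float) -> float:
    # 1 at zero distance, falling linearly to 0 at the search radius.
    return max(0.0, 1.0 - hypot(a.x - b.x, a.y - b.y) / radius)


def composite_similarity(a: Record, b: Record, weights: dict, radius: float) -> float:
    # Weighted average of the per-factor similarities (assumed weighting scheme).
    scores = {
        "name": name_similarity(a, b),
        "area": area_similarity(a, b),
        "distance": distance_similarity(a, b, radius),
    }
    return sum(weights[k] * scores[k] for k in scores) / sum(weights.values())


def fuse(source, target, weights, threshold, radius):
    """Map each source record id to its best target id, if any qualifies."""
    matches = {}
    for s in source:
        best_id, best_score = None, threshold
        for t in target:
            # Limited-range parameter: skip candidates outside the radius.
            if hypot(s.x - t.x, s.y - t.y) > radius:
                continue
            score = composite_similarity(s, t, weights, radius)
            # Similarity threshold: only scores at or above it can match.
            if score >= best_score:
                best_id, best_score = t.rid, score
        if best_id is not None:
            matches[s.rid] = best_id
    return matches
```

The range check runs before any similarity is computed, which is one plausible way such a parameter could reduce the time cost on large datasets; a spatial index would be the natural next step, but it is omitted here to keep the sketch short.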

     
