A Fast Fusion Model for Multi-source Heterogeneous Real Estate Data Driven by Feature Similarity

    Abstract:
    Objectives: Fast fusion of multi-source heterogeneous data is a difficult problem in data integration and database construction for unified real estate registration. Existing fusion methods for multi-source heterogeneous real estate data cannot accurately construct the association relationships between spatial entities, and their time cost is too high for large-scale real estate data.
    Methods: By introducing feature similarity, we propose a multi-level fast fusion model that fuses multi-source heterogeneous real estate data in batches. First, after the multi-source heterogeneous real estate data are digitized and vectorized, similarity factors are introduced to evaluate them. Then, a network model based on a comprehensive similarity weighting algorithm is designed to calculate the comprehensive similarity of the data along the distribution direction of each similarity factor. Finally, a similarity threshold parameter and a limited-range parameter are applied to further improve the accuracy and efficiency of the fusion.
    Results and Conclusions: We take real estate data as an example to analyze the fast fusion model quantitatively. The experimental results show that, compared with other methods, the proposed model has a low time cost: the average fusion time is about 1 s per thousand buildings, and the fusion accuracy reaches 93.50%. The model can therefore effectively improve both the accuracy and the efficiency of fusing multi-source heterogeneous real estate data.
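
    The three steps in the Methods paragraph (similarity factors, weighted comprehensive similarity, threshold and range filtering) can be illustrated with a minimal Python sketch. It is not the network model of the paper, only a plain weighted sum under stated assumptions: the abstract does not name the factors, so the two factors used here (name and area), their weights, the `Record` fields, and the default values of `threshold` and `max_distance` are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import hypot


@dataclass
class Record:
    """Hypothetical record layout; the abstract does not specify one."""
    rid: str     # record identifier
    name: str    # textual label, e.g. a building name
    area: float  # floor area
    x: float     # planar coordinates of a reference point
    y: float


# Assumed weights for the two illustrative similarity factors.
WEIGHTS = {"name": 0.6, "area": 0.4}


def name_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


def area_similarity(a: float, b: float) -> float:
    """Ratio of the smaller to the larger area, in [0, 1]."""
    larger = max(a, b)
    return 1.0 if larger == 0 else min(a, b) / larger


def comprehensive_similarity(r: Record, s: Record) -> float:
    """Weighted sum of the per-factor similarities."""
    return (WEIGHTS["name"] * name_similarity(r.name, s.name)
            + WEIGHTS["area"] * area_similarity(r.area, s.area))


def fuse(sources, targets, threshold=0.8, max_distance=50.0):
    """Map each source record to its best target, if any.

    Targets farther away than max_distance are skipped before any
    similarity is computed (the limited-range parameter), and a match
    is accepted only if its score reaches threshold.
    """
    matches = {}
    for r in sources:
        best_id, best_score = None, threshold
        for s in targets:
            if hypot(r.x - s.x, r.y - s.y) > max_distance:
                continue  # outside the limited range
            score = comprehensive_similarity(r, s)
            if score >= best_score:
                best_id, best_score = s.rid, score
        if best_id is not None:
            matches[r.rid] = best_id
    return matches
```

    In this sketch the range filter runs before the similarity computation, so distant candidates cost only one distance check; this is one plausible way a limited-range parameter could reduce time cost on large datasets.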

     
