A Spatial Data Partitioning Method Based on Manifold Learning
Abstract: Spatial data partitioning is a prerequisite for efficient spatial join operations in spatial database systems. Existing spatial data partitioning methods struggle to keep data redundancy low and data volume balance high while still supporting spatial joins efficiently. To address this problem, we propose a spatial data partitioning algorithm based on manifold learning. Because manifold learning preserves the structure that the source data had before dimensionality reduction, we use it to construct a data partitioning strategy and a mapping method: assigning neighboring objects to the same data block reduces data redundancy, and mapping objects to the smallest data block improves the overall data volume balance. Experiments show that the proposed method achieves very low data redundancy and good data volume balance.
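The two ideas in the abstract, grouping neighboring objects into the same block and mapping objects to the smallest block, can be illustrated with a minimal sketch. It is not the algorithm from the paper. It assumes that each spatial object has already been reduced to a single embedding coordinate by some manifold learning step, which the abstract does not name, and the function names `partition_by_embedding` and `map_to_smallest_block` are hypothetical.

```python
from typing import Dict, List


def partition_by_embedding(embedding: Dict[str, float],
                           num_blocks: int) -> List[List[str]]:
    """Split objects into contiguous runs along a 1-D embedding.

    ASSUMPTION: `embedding` maps an object id to a coordinate produced by
    a manifold learning step, so objects that are close in the source
    space are also close in the embedding.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    # Sort by coordinate (ties broken by id) so neighbors become adjacent.
    ordered = sorted(embedding, key=lambda k: (embedding[k], k))
    base, extra = divmod(len(ordered), num_blocks)
    blocks: List[List[str]] = []
    start = 0
    for i in range(num_blocks):
        # The first `extra` blocks take one more object each, so block
        # sizes differ by at most one.
        size = base + (1 if i < extra else 0)
        blocks.append(ordered[start:start + size])
        start += size
    return blocks


def map_to_smallest_block(blocks: List[List[str]], obj_id: str) -> int:
    """Append an object to the currently smallest block; return its index.

    ASSUMPTION: this is one simple reading of "mapping objects to the
    smallest data block"; ties go to the lowest block index.
    """
    if not blocks:
        raise ValueError("blocks must not be empty")
    target = min(range(len(blocks)), key=lambda i: len(blocks[i]))
    blocks[target].append(obj_id)
    return target
```

For example, partitioning five objects into two blocks yields blocks of three and two objects, and a later call to `map_to_smallest_block` places the next object in the two-object block. Keeping neighbors together is what limits redundancy in this sketch, and the smallest-block rule is what keeps block sizes even.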