Parallel Algorithm for Partitioning Massive Spatial Vector Data in Cloud Environment

YAO Xiaochuang; YANG Jianyu; LI Lin; YE Sijing; YUN Wenju; ZHU Dehai

doi:10.13203/j.whugis20160271

YAO Xiaochuang, YANG Jianyu, LI Lin, YE Sijing, YUN Wenju, ZHU Dehai. Parallel Algorithm for Partitioning Massive Spatial Vector Data in Cloud Environment[J]. Geomatics and Information Science of Wuhan University, 2018, 43(7): 1092-1097. DOI: 10.13203/j.whugis20160271

Abstract

Spatial data partitioning plays an important role in spatial indexing and in data storage strategies for spatial big data. To address the inherent shortcomings of spatial data partitioning and storage on the Hadoop cloud computing platform, this paper presents a parallel algorithm, based on the Hilbert space-filling curve, for partitioning massive spatial vector data. The partitioning phase takes several factors into account, including the spatial location relationships between adjacent objects, the size of each spatial vector object, and the number of spatial objects in the same spatially coded block. Following the principle of merging small coded blocks and sub-splitting large ones, the algorithm is implemented in parallel in a cloud environment. Experimental results show that the proposed algorithm not only improves the efficiency of the R-tree spatial index for massive spatial vector data, but also yields a well-balanced data distribution in the Hadoop Distributed File System (HDFS).
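
The abstract does not give the algorithm's details, so the following is only a minimal, serial sketch of the general idea, not the authors' implementation. It assumes each object is reduced to a centroid and a byte size, assigns it to a grid cell coded by the standard Hilbert curve mapping, and then walks the coded blocks in Hilbert order, filling partitions up to a capacity. Consecutive small blocks therefore end up merged into one partition, and a block larger than the capacity spills across several. The names `hilbert_index`, `partition_by_hilbert`, `extent`, and `capacity` are hypothetical, and the greedy fill is a simplification standing in for the paper's merge and sub-split rules and its parallel execution on Hadoop.

```python
from collections import defaultdict


def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve on a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is read in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def partition_by_hilbert(objects, order, extent, capacity):
    """Group objects into partitions of at most `capacity` bytes, in Hilbert order.

    objects:  iterable of (obj_id, centroid_x, centroid_y, size_bytes)  (assumed layout)
    extent:   (minx, miny, maxx, maxy) bounding box of the whole data set
    capacity: target partition size in bytes (e.g. an HDFS block size)
    A single object larger than `capacity` still gets a partition of its own.
    """
    minx, miny, maxx, maxy = extent
    n = 1 << order

    # Step 1: assign each object to a Hilbert-coded block (grid cell).
    blocks = defaultdict(list)
    for obj_id, cx, cy, size in objects:
        gx = min(n - 1, int((cx - minx) / (maxx - minx) * n))
        gy = min(n - 1, int((cy - miny) / (maxy - miny) * n))
        blocks[hilbert_index(order, gx, gy)].append((obj_id, size))

    # Step 2: walk blocks in curve order; small neighbouring blocks share a
    # partition, and an oversized block is cut when the capacity is reached.
    partitions, current, current_size = [], [], 0
    for code in sorted(blocks):
        for obj_id, size in blocks[code]:
            if current and current_size + size > capacity:
                partitions.append(current)
                current, current_size = [], 0
            current.append(obj_id)
            current_size += size
    if current:
        partitions.append(current)
    return partitions
```

Because objects that are close on the Hilbert curve tend to be close in space, each resulting partition covers a compact region, which is the property that helps a per-partition R-tree index. In this sketch, setting `capacity` near the HDFS block size is one plausible way to approximate the storage balance the abstract describes.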
