Parallel Buffer Analysis of Large Scale Point Features Based on Graph Partitioning
-
摘要: 缓冲分析是解决邻近度问题的基础工具,由于算法本身包含大量的复杂运算,处理效率亟待优化。针对大规模点要素的缓冲分析,引入图表达建立了面向数据和分析过程的空间计算域,通过图划分实现了任务的均衡分割。图式化的空间计算域首先从图节点和图边两个角度定义了点要素及其空间关系的处理函数,然后对相应的时间复杂度进行拟合,获取了图节点和图边的计算权重,最后利用图划分方法实现了缓冲分析的均衡分割,从而构建与计算资源相匹配的并行任务。实验结果表明,基于图划分实现的并行缓冲分析方法在负载均衡性和整体性能方面优于主流的四叉树和规则格网划分方法,可为大规模矢量数据的空间分析优化提供参考。Abstract:Objectives Buffer analysis is a common tool of spatial analysis, which deals with the problem of proximity. Due to numerous and complex operations in the algorithm, the computational efficiency needs to be optimized.Methods To process large scale point features, a graph-based representation model is proposed, which establishes the spatial computational domain for data and analysis, and develops a well-balanced task-partitioning method by partitioning the graph. First, the proposed model defines processing functions of point features and their spatial relationships from the perspectives of graph nodes and graph edges, and provides a logic description for buffer zone generation around point features. Second, the computational weights of graph nodes and graph edges are obtained by fitting the time complexity of the above processing functions. Finally, graph partitioning is adopted to divide the buffer task, which contributes to multiple parallel tasks matching with the computational resources.Results The experimental results show that graph-based buffer analysis can achieve better load balance and overall efficiency, which is superior to the mainstream partitioning methods, regular-grid and quadtree.Conclusions The proposed method can provide a reference for optimization of spatial analysis methods when processing large scale vector data.
-
Keywords:
- vector data /
- buffer analysis /
- spatial computational domain /
- graph partitioning
-
-
表 1 多种半径下串行BGR与并行BGR计算时间
Table 1 Computation Time of Serial BGR and Parallel BGR with Different Radii
序号 缓冲半径/m 结果多边形数量 融合比例/% 串行时间/s 并行时间/s 加速比 1 50 672 151 1.29 106.47 81.65 1.30 2 100 660 333 3.03 109.74 92.01 1.19 3 150 638 495 6.23 116.10 103.21 1.12 4 200 610 145 10.40 127.32 112.67 1.13 5 250 579 401 14.91 140.21 94.98 1.48 6 300 548 306 19.48 152.51 98.84 1.54 7 350 516 911 24.09 165.01 102.51 1.61 8 400 484 913 28.79 179.45 112.78 1.59 9 450 452 125 33.60 194.92 118.52 1.64 10 500 419 060 38.46 209.65 120.52 1.74 11 550 385 672 43.36 233.66 127.92 1.83 12 600 352 021 48.30 289.91 138.79 2.09 13 650 319 934 53.02 392.93 148.13 2.65 14 700 289 453 57.49 1 038.99 296.12 3.51 15 750 261 232 61.64 2 925.87 409.83 7.14 16 800 235 772 65.38 6 883.81 576.84 11.93 17 850 212 723 68.76 18 782.26 1 082.78 17.35 18 900 192 058 71.80 26 595.51 1 392.72 19.10 19 950 173 827 74.47 31 561.23 1 541.99 20.47 20 1 000 157 414 76.88 36 191.83 1 518.41 23.84 表 2 BGR高强度运算测试
Table 2 Intensive Computation Test of BGR
序号 缓冲半径/m 串行时间/s 并行时间/s 加速比 1 10 000 54 516.13 2 391.54 22.80 2 20 000 56 593.58 2 496.18 22.67 3 40 000 133 143.17 5 735.34 23.21 -
[1] 李清泉, 李德仁. 大数据GIS[J]. 武汉大学学报(信息科学版), 2014, 39(6): 641-644. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201406005.htm Li Qingquan, Li Deren. Big Data GIS[J]. Geomatics and Information Science of Wuhan University, 2014, 39(6): 641-644. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201406005.htm
[2] 刘纪平, 董春, 亢晓琛, 等. 大数据时代的地理国情统计分析[J]. 武汉大学学报(信息科学版), 2019, 44(1): 68-76. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201901008.htm Liu Jiping, Dong Chun, Kang Xiaochen, et al. National Geographical Conditions Statistical Analysis in the Era of Big Data[J]. Geomatics and Information Science of Wuhan University, 2019, 44(1): 68-76. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201901008.htm
[3] 赵春宇. 高性能并行GIS中矢量空间数据存取与处理关键技术研究[D]. 武汉: 武汉大学, 2006. Zhao Chunyu. Studying on the Technolgies of Storage and Processing of Spatial Vector Data in High-performance Parallel GIS[D]. Wuhan: Wuhan University, 2006.
[4] Nekola J C, White P S. The Distance Decay of Similarity in Biogeography and Ecology[J]. Journal of Biogeography, 1999, 26(4): 867-878. doi: 10.1046/j.1365-2699.1999.00305.x
[5] Egenhofer M J. A Formal Definition of Binary Topological Relationships[M]//Foundations of Data Organization and Algorithms. Berlin, Heidelberg: Springer, 1989: 457-472.
[6] 郭仁忠, 陈业滨, 赵志刚, 等. ICT时代地图的科学概念及表达框架[J]. 武汉大学学报(信息科学版), 2022, 47(12): 1978-1987. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202212002.htm Guo Renzhong, Chen Yebin, Zhao Zhigang, et al. Scientific Concept and Representation Framework of Maps in the ICT Era[J]. Geomatics and Information Science of Wuhan University, 2022, 47(12): 1978-1987. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202212002.htm
[7] 黄良珂, 李琛, 谢劭峰, 等. 顾及垂直递减率的中国区域Tm格点产品空间插值[J]. 武汉大学学报(信息科学版), 2023, 48(2): 295-300. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202302013.htm Huang Liangke, Li Chen, Xie Shaofeng, et al. Spatial Interpolation of Atmospheric Weighted Mean Temperature Grid Products in China with Conside-ration of Vertical Lapse Rate[J]. Geomatics and Information Science of Wuhan University, 2023, 48(2): 295-300. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202302013.htm
[8] 应申, 靳凤攒, 李霖, 等. 基于ArcGIS Engine的矢量数据分层分块技术研究[J]. 测绘地理信息, 2014, 39(6): 50-53. https://www.cnki.com.cn/Article/CJFDTOTAL-DBCH202305057.htm Ying Shen, Jin Fengzan, Li Lin, et al. Hierarchical Block of Vector Data Based on ArcGIS Engine[J]. Journal of Geomatics, 2014, 39(6): 50-53. https://www.cnki.com.cn/Article/CJFDTOTAL-DBCH202305057.htm
[9] Longley P A, Tobón C. Spatial Dependence and Heterogeneity in Patterns of Hardship: An Intra-urban Analysis[J]. Annals of the Association of American Geographers, 2004, 94(3): 503-519. doi: 10.1111/j.1467-8306.2004.00411.x
[10] Rodríguez A M, Egenhofer M J. Comparing Geospatial Entity Classes: An Asymmetric and Context-dependent Similarity Measure[J]. International Journal of Geographical Information Science, 2004, 18(3): 229-256. doi: 10.1080/13658810310001629592
[11] Yang C W, Wu H Y, Huang Q Y, et al. Using Spatial Principles to Optimize Distributed Computing for Enabling the Physical Science Discoveries[J]. Proceedings of the National Academy of Sciences of the United States of America, 2011, 108(14): 5498-5503. doi: 10.1073/pnas.0909315108
[12] Werner M. Parallel Processing Strategies for Big Geospatial Data[J]. Frontiers in Big Data, 2019, 2: 44. doi: 10.3389/fdata.2019.00044
[13] Wang S W. A CyberGIS Framework for the Synthesis of Cyberinfrastructure, GIS, and Spatial Analysis[J]. Annals of the Association of American Geographers, 2010, 100(3): 535-557. doi: 10.1080/00045601003791243
[14] Zhang J T, You S M. CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs[C]//The 3rd ACM SIGSPATIAL International Workshop on GeoStreaming, Redondo Beach, USA, 2012.
[15] Aji A, Wang F S, Vo H, et al. HadoopGIS: A High Performance Spatial Data Warehousing System over MapReduce[J]. Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, 2013, 6(11): 1009.
[16] Kang X C, Liu J P, Lin X G. Streaming Progressive TIN Densification Filter for Airborne LiDAR Point Clouds Using Multi-core Architectures[J]. Remote Sensing, 2014, 6(8): 7212-7232.
[17] 郭明强, 谢忠, 黄颖. 集群并发环境下大规模矢量数据负载均衡算法[J]. 武汉大学学报(信息科学版), 2013, 38(9): 1131-1134. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201309027.htm Guo Mingqiang, Xie Zhong, Huang Ying. Content Grid Load Balancing Algorithm for Large-scale Vector Data in the Server Cluster Concurrent Environment[J]. Geomatics and Information Science of Wuhan University, 2013, 38(9): 1131-1134. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201309027.htm
[18] Shen J X, Chen L, Wu Y, et al. Approach to Acce-lerating Dissolved Vector Buffer Generation in Distributed In-memory Cluster Architecture[J]. ISPRS International Journal of Geo-Information, 2018, 7(1): 26.
[19] Fan J F, Ji M, Gu G M, et al. Optimization Approaches to Mpi and Area Merging-based Parallel Buffer Algorithm[J]. Boletim De Ciências Geodésicas, 2014, 20(2): 237-256.
[20] Wu Y, Ma M Y, Chen L. HiVewshed: An Interactive Online Viewshed Analysis System for Multiple Observers[C]//The 3rd International Conference on Computer Science and Application Engineering, Sanya, China, 2019.
[21] Guo M Q, Han C D, Guan Q F, et al. A Universal Parallel Scheduling Approach to Polyline and Polygon Vector Data Buffer Analysis on Conventional GIS Platforms[J]. Transactions in GIS, 2020, 24(6): 1630-1654.
[22] Hennessy J L, Patterson D A, Arpaci-Dusseau A C. A Quantitative Approach[M]. Boston: Morgan Kaufmann, 2011.
[23] Bernstein A J. Analysis of Programs for Parallel Processing[J]. IEEE Transactions on Electronic Computers, 1966, EC-15(5): 757-763.
[24] Wang S W, Armstrong M P. A Theoretical Approach to the Use of Cyberinfrastructure in Geographical Analysis[J]. International Journal of Geographical Information Science, 2009, 23(2): 169-193.
[25] Lipton R J, Tarjan R E. A Separator Theorem for Planar Graphs[J]. SIAM Journal on Applied Mathematics, 1979, 36(2): 177-189.
[26] Skiena S S. The Algorithm Design Manual[M]. London: Springer, 2008.
[27] 许宝刚. 图的划分: 一些进展与未解决问题[J]. 数学进展, 2016, 45(1): 1-20. https://www.cnki.com.cn/Article/CJFDTOTAL-SXJZ201601001.htm Xu Baogang. Graph Partitions: Recent Progresses and some Open Problems[J]. Advances in Mathematics, 2016, 45(1): 1-20. https://www.cnki.com.cn/Article/CJFDTOTAL-SXJZ201601001.htm
[28] Wieling M, Nerbonne J. Bipartite Spectral Graph Partitioning for Clustering Dialect Varieties and Detecting their Linguistic Features[J]. Computer Speech & Language, 2011, 25(3): 700-715.
[29] Plimpton S J, Moore S G, Borner A, et al. Direct Simulation Monte Carlo on Petaflop Supercomputers and Beyond[J]. Physics of Fluids, 2019, 31(8): 086101.
[30] Stanton I, Kliot G. Streaming Graph Partitioning for Large Distributed Graphs[C]// The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, USA, 2012.
[31] 王鑫, 陈蔚雪, 杨雅君, 等. 知识图谱划分算法研究综述[J]. 计算机学报, 2021, 44(1): 235-260. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJX202101012.htm Wang Xin, Chen Weixue, Yang Yajun, et al. Research on Knowledge Graph Partitioning Algorithms: A Survey[J]. Chinese Journal of Computers, 2021, 44(1): 235-260. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJX202101012.htm
-
期刊类型引用(4)
1. 朱杰,郑加柱,陈红华,杨静,胡平昌,陆敏燕. 结合POI数据的南京市商业中心识别与集聚特征研究. 现代测绘. 2022(06): 34-39 . 百度学术
2. 金澄,安晓亚,陈占龙,马啸川. 矢量居民地多边形多级图划分聚类方法. 武汉大学学报(信息科学版). 2021(01): 19-29 . 百度学术
3. 张铭龙,何贞铭. 基于因子分析法的城市商业中心抽取研究. 地理空间信息. 2021(08): 58-60+64+5 . 百度学术
4. 李卫东,张铭龙,段金龙. 基于POI数据的南京市空间格局定量研究. 世界地理研究. 2020(02): 317-326 . 百度学术
其他类型引用(3)