图划分支持下的大规模点要素并行缓冲分析方法

亢晓琛, 刘纪平

亢晓琛, 刘纪平. 图划分支持下的大规模点要素并行缓冲分析方法[J]. 武汉大学学报 ( 信息科学版), 2023, 48(6): 979-987. DOI: 10.13203/j.whugis20210011
引用本文: 亢晓琛, 刘纪平. 图划分支持下的大规模点要素并行缓冲分析方法[J]. 武汉大学学报 ( 信息科学版), 2023, 48(6): 979-987. DOI: 10.13203/j.whugis20210011
KANG Xiaochen, LIU Jiping. Parallel Buffer Analysis of Large Scale Point Features Based on Graph Partitioning[J]. Geomatics and Information Science of Wuhan University, 2023, 48(6): 979-987. DOI: 10.13203/j.whugis20210011
Citation: KANG Xiaochen, LIU Jiping. Parallel Buffer Analysis of Large Scale Point Features Based on Graph Partitioning[J]. Geomatics and Information Science of Wuhan University, 2023, 48(6): 979-987. DOI: 10.13203/j.whugis20210011

图划分支持下的大规模点要素并行缓冲分析方法

基金项目: 

国家自然科学基金 41701461

中国测绘科学研究院基本科研业务费项目 AR2001

详细信息
    作者简介:

    亢晓琛,博士,副研究员,主要从事高性能地理计算和地理国情监测理论与方法研究。kangxc@casm.ac.cn

  • 中图分类号: P208

Parallel Buffer Analysis of Large Scale Point Features Based on Graph Partitioning

  • 摘要: 缓冲分析是解决邻近度问题的基础工具,由于算法本身包含大量的复杂运算,处理效率亟待优化。针对大规模点要素的缓冲分析,引入图表达建立了面向数据和分析过程的空间计算域,通过图划分实现了任务的均衡分割。图式化的空间计算域首先从图节点和图边两个角度定义了点要素及其空间关系的处理函数,然后对相应的时间复杂度进行拟合,获取了图节点和图边的计算权重,最后利用图划分方法实现了缓冲分析的均衡分割,从而构建与计算资源相匹配的并行任务。实验结果表明,基于图划分实现的并行缓冲分析方法在负载均衡性和整体性能方面优于主流的四叉树和规则格网划分方法,可为大规模矢量数据的空间分析优化提供参考。
    Abstract:
      Objectives  Buffer analysis is a common tool of spatial analysis, which deals with the problem of proximity. Due to numerous and complex operations in the algorithm, the computational efficiency needs to be optimized.
      Methods  To process large scale point features, a graph-based representation model is proposed, which establishes the spatial computational domain for data and analysis, and develops a well-balanced task-partitioning method by partitioning the graph. First, the proposed model defines processing functions of point features and their spatial relationships from the perspectives of graph nodes and graph edges, and provides a logic description for buffer zone generation around point features. Second, the computational weights of graph nodes and graph edges are obtained by fitting the time complexity of the above processing functions. Finally, graph partitioning is adopted to divide the buffer task, which contributes to multiple parallel tasks matching with the computational resources.
      Results  The experimental results show that graph-based buffer analysis can achieve better load balance and overall efficiency, which is superior to the mainstream partitioning methods, regular-grid and quadtree.
      Conclusions  The proposed method can provide a reference for optimization of spatial analysis methods when processing large scale vector data.
  • 图  1   点要素距离依赖关系搜索

    Figure  1.   Searching for Distance Dependent Relationships of Point Features

    图  2   BGR实例化流程

    Figure  2.   Procedure for BGR Instantiation

    图  3   Metis加权图存储格式

    Figure  3.   Storage Format for Metis Weighted Graph

    图  4   规则格网划分与多级划分对比

    Figure  4.   Comparison of Regular-grid and Multi-level Partitioning

    图  5   基于BGR-Log的缓冲融合区并行生成

    Figure  5.   Parallel Dissolved Buffer Zone Generation Based on BGR-Log

    图  6   等间距点要素缓冲区融合的拟合函数时间

    Figure  6.   Fitting Function Time of Dissolved Buffer Zone Generation with Equidistant Point Features

    图  7   不同缓冲半径下3类空间划分方法时间对比

    Figure  7.   Partitioning Time Comparison of Three Methods with Different Radii

    图  8   不同划分方法局部效果图

    Figure  8.   Local Effects of Different Partitioning Methods

    图  9   空间划分均衡性与割边处理时间

    Figure  9.   Balance of Spatial Partitioning and Cut-Edge Processing Time

    表  1   多种半径下串行BGR与并行BGR计算时间

    Table  1   Computation Time of Serial BGR and Parallel BGR with Different Radii

    序号 缓冲半径/m 结果多边形数量 融合比例/% 串行时间/s 并行时间/s 加速比
    1 50 672 151 1.29 106.47 81.65 1.30
    2 100 660 333 3.03 109.74 92.01 1.19
    3 150 638 495 6.23 116.10 103.21 1.12
    4 200 610 145 10.40 127.32 112.67 1.13
    5 250 579 401 14.91 140.21 94.98 1.48
    6 300 548 306 19.48 152.51 98.84 1.54
    7 350 516 911 24.09 165.01 102.51 1.61
    8 400 484 913 28.79 179.45 112.78 1.59
    9 450 452 125 33.60 194.92 118.52 1.64
    10 500 419 060 38.46 209.65 120.52 1.74
    11 550 385 672 43.36 233.66 127.92 1.83
    12 600 352 021 48.30 289.91 138.79 2.09
    13 650 319 934 53.02 392.93 148.13 2.65
    14 700 289 453 57.49 1 038.99 296.12 3.51
    15 750 261 232 61.64 2 925.87 409.83 7.14
    16 800 235 772 65.38 6 883.81 576.84 11.93
    17 850 212 723 68.76 18 782.26 1 082.78 17.35
    18 900 192 058 71.80 26 595.51 1 392.72 19.10
    19 950 173 827 74.47 31 561.23 1 541.99 20.47
    20 1 000 157 414 76.88 36 191.83 1 518.41 23.84
    下载: 导出CSV

    表  2   BGR高强度运算测试

    Table  2   Intensive Computation Test of BGR

    序号 缓冲半径/m 串行时间/s 并行时间/s 加速比
    1 10 000 54 516.13 2 391.54 22.80
    2 20 000 56 593.58 2 496.18 22.67
    3 40 000 133 143.17 5 735.34 23.21
    下载: 导出CSV
  • [1] 李清泉, 李德仁. 大数据GIS[J]. 武汉大学学报(信息科学版), 2014, 39(6): 641-644. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201406005.htm

    Li Qingquan, Li Deren. Big Data GIS[J]. Geomatics and Information Science of Wuhan University, 2014, 39(6): 641-644. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201406005.htm

    [2] 刘纪平, 董春, 亢晓琛, 等. 大数据时代的地理国情统计分析[J]. 武汉大学学报(信息科学版), 2019, 44(1): 68-76. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201901008.htm

    Liu Jiping, Dong Chun, Kang Xiaochen, et al. National Geographical Conditions Statistical Analysis in the Era of Big Data[J]. Geomatics and Information Science of Wuhan University, 2019, 44(1): 68-76. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201901008.htm

    [3] 赵春宇. 高性能并行GIS中矢量空间数据存取与处理关键技术研究[D]. 武汉: 武汉大学, 2006.

    Zhao Chunyu. Studying on the Technolgies of Storage and Processing of Spatial Vector Data in High-performance Parallel GIS[D]. Wuhan: Wuhan University, 2006.

    [4]

    Nekola J C, White P S. The Distance Decay of Similarity in Biogeography and Ecology[J]. Journal of Biogeography, 1999, 26(4): 867-878. doi: 10.1046/j.1365-2699.1999.00305.x

    [5]

    Egenhofer M J. A Formal Definition of Binary Topological Relationships[M]//Foundations of Data Organization and Algorithms. Berlin, Heidelberg: Springer, 1989: 457-472.

    [6] 郭仁忠, 陈业滨, 赵志刚, 等. ICT时代地图的科学概念及表达框架[J]. 武汉大学学报(信息科学版), 2022, 47(12): 1978-1987. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202212002.htm

    Guo Renzhong, Chen Yebin, Zhao Zhigang, et al. Scientific Concept and Representation Framework of Maps in the ICT Era[J]. Geomatics and Information Science of Wuhan University, 2022, 47(12): 1978-1987. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202212002.htm

    [7] 黄良珂, 李琛, 谢劭峰, 等. 顾及垂直递减率的中国区域Tm格点产品空间插值[J]. 武汉大学学报(信息科学版), 2023, 48(2): 295-300. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202302013.htm

    Huang Liangke, Li Chen, Xie Shaofeng, et al. Spatial Interpolation of Atmospheric Weighted Mean Temperature Grid Products in China with Conside-ration of Vertical Lapse Rate[J]. Geomatics and Information Science of Wuhan University, 2023, 48(2): 295-300. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH202302013.htm

    [8] 应申, 靳凤攒, 李霖, 等. 基于ArcGIS Engine的矢量数据分层分块技术研究[J]. 测绘地理信息, 2014, 39(6): 50-53. https://www.cnki.com.cn/Article/CJFDTOTAL-DBCH202305057.htm

    Ying Shen, Jin Fengzan, Li Lin, et al. Hierarchical Block of Vector Data Based on ArcGIS Engine[J]. Journal of Geomatics, 2014, 39(6): 50-53. https://www.cnki.com.cn/Article/CJFDTOTAL-DBCH202305057.htm

    [9]

    Longley P A, Tobón C. Spatial Dependence and Heterogeneity in Patterns of Hardship: An Intra-urban Analysis[J]. Annals of the Association of American Geographers, 2004, 94(3): 503-519. doi: 10.1111/j.1467-8306.2004.00411.x

    [10]

    Rodríguez A M, Egenhofer M J. Comparing Geospatial Entity Classes: An Asymmetric and Context-dependent Similarity Measure[J]. International Journal of Geographical Information Science, 2004, 18(3): 229-256. doi: 10.1080/13658810310001629592

    [11]

    Yang C W, Wu H Y, Huang Q Y, et al. Using Spatial Principles to Optimize Distributed Computing for Enabling the Physical Science Discoveries[J]. Proceedings of the National Academy of Sciences of the United States of America, 2011, 108(14): 5498-5503. doi: 10.1073/pnas.0909315108

    [12]

    Werner M. Parallel Processing Strategies for Big Geospatial Data[J]. Frontiers in Big Data, 2019, 2: 44. doi: 10.3389/fdata.2019.00044

    [13]

    Wang S W. A CyberGIS Framework for the Synthesis of Cyberinfrastructure, GIS, and Spatial Analysis[J]. Annals of the Association of American Geographers, 2010, 100(3): 535-557. doi: 10.1080/00045601003791243

    [14]

    Zhang J T, You S M. CudaGIS: Report on the Design and Realization of a Massive Data Parallel GIS on GPUs[C]//The 3rd ACM SIGSPATIAL International Workshop on GeoStreaming, Redondo Beach, USA, 2012.

    [15]

    Aji A, Wang F S, Vo H, et al. HadoopGIS: A High Performance Spatial Data Warehousing System over MapReduce[J]. Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, 2013, 6(11): 1009.

    [16]

    Kang X C, Liu J P, Lin X G. Streaming Progressive TIN Densification Filter for Airborne LiDAR Point Clouds Using Multi-core Architectures[J]. Remote Sensing, 2014, 6(8): 7212-7232.

    [17] 郭明强, 谢忠, 黄颖. 集群并发环境下大规模矢量数据负载均衡算法[J]. 武汉大学学报(信息科学版), 2013, 38(9): 1131-1134. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201309027.htm

    Guo Mingqiang, Xie Zhong, Huang Ying. Content Grid Load Balancing Algorithm for Large-scale Vector Data in the Server Cluster Concurrent Environment[J]. Geomatics and Information Science of Wuhan University, 2013, 38(9): 1131-1134. https://www.cnki.com.cn/Article/CJFDTOTAL-WHCH201309027.htm

    [18]

    Shen J X, Chen L, Wu Y, et al. Approach to Acce-lerating Dissolved Vector Buffer Generation in Distributed In-memory Cluster Architecture[J]. ISPRS International Journal of Geo-Information, 2018, 7(1): 26.

    [19]

    Fan J F, Ji M, Gu G M, et al. Optimization Approaches to Mpi and Area Merging-based Parallel Buffer Algorithm[J]. Boletim De Ciências Geodésicas, 2014, 20(2): 237-256.

    [20]

    Wu Y, Ma M Y, Chen L. HiVewshed: An Interactive Online Viewshed Analysis System for Multiple Observers[C]//The 3rd International Conference on Computer Science and Application Engineering, Sanya, China, 2019.

    [21]

    Guo M Q, Han C D, Guan Q F, et al. A Universal Parallel Scheduling Approach to Polyline and Polygon Vector Data Buffer Analysis on Conventional GIS Platforms[J]. Transactions in GIS, 2020, 24(6): 1630-1654.

    [22]

    Hennessy J L, Patterson D A, Arpaci-Dusseau A C. A Quantitative Approach[M]. Boston: Morgan Kaufmann, 2011.

    [23]

    Bernstein A J. Analysis of Programs for Parallel Processing[J]. IEEE Transactions on Electronic Computers, 1966, EC-15(5): 757-763.

    [24]

    Wang S W, Armstrong M P. A Theoretical Approach to the Use of Cyberinfrastructure in Geographical Analysis[J]. International Journal of Geographical Information Science, 2009, 23(2): 169-193.

    [25]

    Lipton R J, Tarjan R E. A Separator Theorem for Planar Graphs[J]. SIAM Journal on Applied Mathematics, 1979, 36(2): 177-189.

    [26]

    Skiena S S. The Algorithm Design Manual[M]. London: Springer, 2008.

    [27] 许宝刚. 图的划分: 一些进展与未解决问题[J]. 数学进展, 2016, 45(1): 1-20. https://www.cnki.com.cn/Article/CJFDTOTAL-SXJZ201601001.htm

    Xu Baogang. Graph Partitions: Recent Progresses and some Open Problems[J]. Advances in Mathematics, 2016, 45(1): 1-20. https://www.cnki.com.cn/Article/CJFDTOTAL-SXJZ201601001.htm

    [28]

    Wieling M, Nerbonne J. Bipartite Spectral Graph Partitioning for Clustering Dialect Varieties and Detecting their Linguistic Features[J]. Computer Speech & Language, 2011, 25(3): 700-715.

    [29]

    Plimpton S J, Moore S G, Borner A, et al. Direct Simulation Monte Carlo on Petaflop Supercomputers and Beyond[J]. Physics of Fluids, 2019, 31(8): 086101.

    [30]

    Stanton I, Kliot G. Streaming Graph Partitioning for Large Distributed Graphs[C]// The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, USA, 2012.

    [31] 王鑫, 陈蔚雪, 杨雅君, 等. 知识图谱划分算法研究综述[J]. 计算机学报, 2021, 44(1): 235-260. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJX202101012.htm

    Wang Xin, Chen Weixue, Yang Yajun, et al. Research on Knowledge Graph Partitioning Algorithms: A Survey[J]. Chinese Journal of Computers, 2021, 44(1): 235-260. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJX202101012.htm

  • 期刊类型引用(4)

    1. 朱杰,郑加柱,陈红华,杨静,胡平昌,陆敏燕. 结合POI数据的南京市商业中心识别与集聚特征研究. 现代测绘. 2022(06): 34-39 . 百度学术
    2. 金澄,安晓亚,陈占龙,马啸川. 矢量居民地多边形多级图划分聚类方法. 武汉大学学报(信息科学版). 2021(01): 19-29 . 百度学术
    3. 张铭龙,何贞铭. 基于因子分析法的城市商业中心抽取研究. 地理空间信息. 2021(08): 58-60+64+5 . 百度学术
    4. 李卫东,张铭龙,段金龙. 基于POI数据的南京市空间格局定量研究. 世界地理研究. 2020(02): 317-326 . 百度学术

    其他类型引用(3)

图(9)  /  表(2)
计量
  • 文章访问数:  306
  • HTML全文浏览量:  78
  • PDF下载量:  66
  • 被引次数: 7
出版历程
  • 收稿日期:  2022-12-11
  • 网络出版日期:  2023-06-11
  • 发布日期:  2023-06-04

目录

    /

    返回文章
    返回