Fine-Grained Parallel Optimization of Large-Scale Data for PMVS Algorithm

LIU Jinshuo; LI Yangmei; JIANG Zhuangyi; DENG Juan; SUI Haigang; PAN Jeff

doi:10.13203/j.whugis20160186


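Abstract

Summary: The patch-based multi-view stereo (PMVS) algorithm for 3D reconstruction is widely used in digital-city and related applications because of its good reconstruction quality, but its execution efficiency is low on large-scale computations. To address this, a fine-grained parallel optimization method is proposed that optimizes three aspects: task partitioning and load balancing, main system memory and GPU memory, and communication overhead. In addition, a GPU and multi-threaded parallelization of the feature extraction stage of the patch-based PMVS algorithm is designed, achieving multi-granularity cooperative parallelism across CPUs and GPUs (CPUs_GPUs). Experimental results show that the CPU multi-threading strategy achieves a 4x speedup and that the parallel strategy based on the compute unified device architecture (CUDA) achieves a speedup of up to 34x, while the proposed strategy delivers a further 30% performance improvement over the CUDA parallel strategy. The method can be used for fast scheduling of computing resources in big-data processing in other domains.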

Abstract: We address the problem of fine-grained parallel optimization for large-scale data. The patch-based multi-view stereo (PMVS) algorithm has been widely applied in digital-city and other fields because of its good three-dimensional reconstruction quality; however, its execution efficiency is low for large-scale computation. To address this limitation, this paper proposes a fine-grained parallel optimization method that covers three aspects: task allocation and load balancing, strategies for main system memory and GPU memory, and optimization of communication overhead. On the CPU side, we implement multi-threading with the pthreads library to take full advantage of the computing power of multi-core CPUs. On the GPU side, we use the CUDA framework and optimize thread organization and memory access. In addition, we adopt a memory pool model and a pipelining model to improve bandwidth utilization. The memory pool model reduces the impact of bus data transfers on CPUs_GPUs that are waiting for resources, and the pipelining model hides the communication time the CPU spends reading data from memory. We use Harris-DOG feature extraction in the PMVS algorithm on image sequences as an example to verify the optimization strategies. The experiments demonstrate that the CPU multi-threading strategy achieves a speedup of 4 times, the CUDA-based parallel strategy achieves a speedup of up to 34 times, and our strategy improves performance by a further 30% over the CUDA-based parallel strategy. In the future, the optimization strategy can be applied to fast scheduling of computing resources in big-data processing in other domains.
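
The CPU-side task allocation and load balancing mentioned in the abstract can be pictured with a small pthreads sketch. This is only an illustration under stated assumptions, not the paper's implementation: the names `work_queue`, `process_image`, `run_parallel`, and `MAX_THREADS` are hypothetical, and the simulated per-image cost merely stands in for the real workload (Harris-DOG feature extraction on each image). Worker threads claim the next unprocessed image from a shared counter, so a thread that finishes a cheap image immediately picks up more work instead of idling.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* illustrative upper bound, not from the paper */

/* Shared state for dynamic task allocation (hypothetical structure). */
typedef struct {
    const int *cost;       /* simulated workload per image (assumption) */
    long *result;          /* one output slot per image */
    size_t n_images;
    size_t next;           /* index of the next unclaimed image */
    pthread_mutex_t lock;  /* protects `next` */
} work_queue;

/* Placeholder for per-image feature extraction: returns 1 + 2 + ... + cost. */
static long process_image(int cost) {
    long acc = 0;
    for (int i = 1; i <= cost; ++i) acc += i;
    return acc;
}

/* Each worker repeatedly claims one image until the queue is empty. */
static void *worker(void *arg) {
    work_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        size_t i = q->next;
        if (i < q->n_images) q->next++;
        pthread_mutex_unlock(&q->lock);
        if (i >= q->n_images) break;
        /* Each image index is claimed by exactly one thread, so this
           write needs no further synchronization. */
        q->result[i] = process_image(q->cost[i]);
    }
    return NULL;
}

/* Processes all images with up to n_threads workers; returns 0 on success. */
int run_parallel(const int *cost, long *result, size_t n_images, int n_threads) {
    assert(n_threads > 0 && n_threads <= MAX_THREADS);
    work_queue q = { cost, result, n_images, 0 };
    if (pthread_mutex_init(&q.lock, NULL) != 0) return -1;

    pthread_t tid[MAX_THREADS];
    int started = 0;
    for (; started < n_threads; ++started)
        if (pthread_create(&tid[started], NULL, worker, &q) != 0) break;
    for (int t = 0; t < started; ++t) pthread_join(tid[t], NULL);

    pthread_mutex_destroy(&q.lock);
    /* If at least one worker started, the shared queue was fully drained. */
    return started > 0 ? 0 : -1;
}
```

Calling `run_parallel(cost, out, n, 4)` with arrays of length `n` fills `out` using four workers. Claiming images from a shared counter, rather than assigning each thread a fixed block in advance, is one simple way to balance load when per-image cost varies; whether the paper uses this particular scheme is not stated in the abstract.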
