Semantic Segmentation of Point Clouds Using Local Geometric Features and Dilated Neighborhoods

Xiang Xueyong; Li Guangyun; Wang Li; Zong Wenpeng; Lü Zhipeng; Xiang Fengzhuo

doi:10.13203/j.whugis20200567

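Summary: Point clouds are large in volume and lack topological structure, so existing deep learning semantic segmentation models struggle to fully exploit the geometric features hidden in point clouds over large neighborhoods. We therefore propose a point cloud semantic segmentation model that is based on dilated neighborhoods and takes geometric features such as angles as model input. First, when constructing local neighborhoods, the dilated convolution operation from image processing is extended to point clouds to build a dilated neighborhood structure that enlarges the receptive field. Then, during feature extraction, basic geometric features between the center point and its neighboring points, such as relative coordinates, distances, and angles, are used as model input to exploit the geometric features within the neighborhood as fully as possible. Finally, a point cloud semantic segmentation model is built on the proposed neighborhood structure and feature extraction algorithm. Experiments on the Semantic3D dataset show that the proposed model segments better than the point cloud semantic segmentation algorithms it is compared with, and that the dilated neighborhood and local geometric input features effectively improve the performance of point cloud semantic segmentation models.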
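The dilated neighborhood idea can be illustrated with a minimal sketch. The abstract does not spell out the exact construction, so the code below assumes one common formulation: take the k × d nearest candidates and keep every d-th one. The function name `dilated_knn` and its parameters are hypothetical, and the brute-force search is for illustration only (`math.dist` requires Python 3.8 or later).

```python
import math

def dilated_knn(points, center_idx, k, dilation):
    """Pick k neighbors of points[center_idx] from its k * dilation nearest
    candidates, keeping every `dilation`-th one (assumed formulation)."""
    center = points[center_idx]
    # Brute force: sort all other points by Euclidean distance to the center.
    candidates = sorted(
        (i for i in range(len(points)) if i != center_idx),
        key=lambda i: math.dist(points[i], center),
    )
    # Strided slice: with dilation == 1 this is an ordinary k-NN neighborhood.
    return candidates[: k * dilation : dilation]

# Seven points on a line; the center is the first point.
pts = [(float(x), 0.0, 0.0) for x in range(7)]
plain = dilated_knn(pts, 0, k=3, dilation=1)    # nearest three neighbors
dilated = dilated_knn(pts, 0, k=3, dilation=2)  # same k, twice the reach
```

Both calls return three neighbors, so a layer consuming them needs the same number of parameters, but the dilated neighborhood reaches farther from the center.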

Abstract:
Objectives Point clouds have no topological structure, so current deep learning semantic segmentation algorithms have difficulty capturing the geometric features implied in irregular points. In addition, point clouds lie in three-dimensional space and involve large amounts of data. Blindly enlarging the receptive field when extracting neighborhood information increases the number of model parameters, which makes model training difficult.
Methods We propose a point cloud semantic segmentation model that is based on dilated convolution and takes elementary geometric features, such as angles, as model input. First, during feature extraction, basic geometric features such as the relative coordinates, distance, and angle between the center point and its neighboring points are used as model input to mine geometric information. Second, when building local neighborhoods, we extend the dilated convolution operator from image processing to point clouds; the resulting point cloud dilated operator enlarges the receptive field without increasing the number of model parameters. Finally, the dilated convolution operator, the multi-geometric feature encoding module, and a U-Net architecture are combined to form a complete point cloud semantic segmentation model.
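As a rough illustration of the geometric input features named above, the following sketch computes relative coordinates, distance, and an angle for one center/neighbor pair. The abstract does not define which angle is used; the angle between the offset vector and the z-axis is an assumption made here, and `encode_neighbor` is a hypothetical name.

```python
import math

def encode_neighbor(center, neighbor):
    """Return [dx, dy, dz, distance, angle] for one center/neighbor pair."""
    # Relative coordinates of the neighbor with respect to the center point.
    dx, dy, dz = (n - c for n, c in zip(neighbor, center))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Assumed angle definition: angle between the offset vector and the z-axis.
    # The cosine is clamped to [-1, 1]; coincident points get angle 0.
    angle = math.acos(max(-1.0, min(1.0, dz / dist))) if dist > 0 else 0.0
    return [dx, dy, dz, dist, angle]

above = encode_neighbor((0.0, 0.0, 0.0), (0.0, 0.0, 2.0))   # directly above
beside = encode_neighbor((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))  # same height
```

In a full model, one such vector per neighbor would presumably be passed to a shared encoder; that part is outside the scope of this sketch.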
Results Compared with the traditional neighborhood structure, the dilated neighborhood structure increases the overall accuracy (OA) by 1.4%. Compared with a model that uses only coordinates as input, the multi-geometric feature encoding module yields an improvement of 10.7%. The final model, built on the two proposed algorithms, achieves an OA of 91.2% and a mean intersection over union (mIoU) of 68.2%.
Conclusions The dilated neighborhood structure can effectively extract point cloud information over a larger range without increasing the number of model parameters. The multi-geometric feature encoding module maximizes the capture of shape information within the neighborhood.
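For readers unfamiliar with the two metrics reported above, the sketch below shows how overall accuracy and mean intersection over union are commonly computed from a confusion matrix. The layout (rows are ground-truth classes, columns are predicted classes) and the choice to skip classes that appear in neither truth nor prediction are assumptions of this example, not details taken from the paper.

```python
def overall_accuracy(conf):
    """Fraction of all points whose predicted class matches the ground truth."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total

def mean_iou(conf):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # truth c, predicted other
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, truth other
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)

# Toy two-class confusion matrix (made-up counts, not data from the paper).
toy = [[8, 2],
       [1, 9]]
oa = overall_accuracy(toy)   # 17 correct out of 20 points
miou = mean_iou(toy)         # mean of 8/11 and 9/12
```

Because IoU penalizes both false positives and false negatives for every class equally, mIoU is usually lower than OA on datasets with imbalanced classes.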
