WANG Zijiao, XU Chunyan, ZHOU Chuanwei, CUI Zhen. Progressive Dynamic Graph Network for Image Segmentation[J]. Geomatics and Information Science of Wuhan University, 2024, 49(7): 1212-1223. DOI: 10.13203/j.whugis20220070

Progressive Dynamic Graph Network for Image Segmentation

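  • Abstract: Most state-of-the-art deep learning-based image segmentation algorithms lack the ability to incorporate image context and overlook the role and influence of contextual information on the segmentation contour, which limits further gains in performance. To address this, a contour-based image segmentation method is proposed that deforms the contour with a progressive dynamic graph network. Specifically, following the topology of the object contour, vertices are sampled on the contour to turn it into a dynamic graph; inference is learned by diffusing the contextual information of the object points, and historical learning experience is accumulated to update the contour graph dynamically. Graph updating and object position prediction are carried out end to end and encapsulated in a closed-loop learning process. Tests on the Cityscapes, KINS and SBD datasets verify the effectiveness and real-time capability of the proposed method for real-time instance segmentation. Specifically, it achieves the best performance on Cityscapes and SBD, with 34.4% and 55.3% average precision (AP) respectively, and reaches 30.5% AP on KINS, with a good segmentation fit; compared with the Deep Snake model, its segmentation contours fit the true object boundaries more closely and are smoother. In terms of running speed, the dynamic graph model also achieves the best results on the three datasets, at 3.8, 7.6 and 13.1 frames/s respectively, an average improvement of 0.5 frames/s over the Deep Snake model.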

     

    Abstract:
    Objectives Image segmentation is an important topic in image processing and computer vision, with wide applications in robot perception, autonomous driving, medical image analysis, scene understanding, video surveillance, virtual reality and augmented reality. It also faces many challenges; for example, complex scenes require contextual information to obtain precise results. Deep learning-based image segmentation algorithms have made great progress, but most of them lack the ability to incorporate image context and ignore the role and influence of contextual information on the segmentation contour, which limits further improvement in performance. We therefore propose a contour-based image segmentation approach that uses a progressive dynamic graph network to deform the contour.
    Methods Specifically, following the topology of the object contour, we first sample vertices on the contour and convert it into a dynamic graph. We discard the disorder and uncertain adjacency of a general graph and construct a graph with a fixed topology. We then perform inference learning by diffusing the contextual information of the object points, and progressively accumulate historical learning experience to update the contour graph dynamically. Graph updating and object position prediction are carried out end to end and encapsulated in a closed-loop learning process. The proposed approach thus takes full account of both context and historical learning experience, and it compensates for the shortcomings of pixel-wise segmentation methods and traditional active contour models.
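The pipeline described in Methods (sample contour vertices, fix a ring topology, diffuse context, accumulate history, update positions) can be illustrated with a small sketch. This is not the authors' network: the learned inference is replaced here by a hand-written neighborhood average, the history accumulation by a simple momentum term, and every function name and parameter (`resample_closed_contour`, `ring_neighbors`, `diffuse`, `progressive_update`, `k`, `momentum`, `step`) is a hypothetical choice made for illustration only.

```python
import math


def resample_closed_contour(points, n_vertices):
    """Resample a closed polygon to n_vertices points evenly spaced along its perimeter."""
    m = len(points)
    # Edge lengths, including the closing edge from the last point back to the first.
    seg = [math.dist(points[i], points[(i + 1) % m]) for i in range(m)]
    spacing = sum(seg) / n_vertices
    out = []
    i, start = 0, 0.0  # current edge index and arc length at its start
    for v in range(n_vertices):
        target = v * spacing
        while i < m - 1 and start + seg[i] <= target:
            start += seg[i]
            i += 1
        t = (target - start) / seg[i] if seg[i] > 0 else 0.0
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % m]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def ring_neighbors(n, k=1):
    """Fixed topology: vertex i is linked to its k predecessors and k successors on the ring."""
    return [[(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)]


def diffuse(features, neighbors):
    """One diffusion step: average each vertex feature with its neighbors' features.
    (Assumption: a plain mean stands in for the learned aggregation.)"""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [features[i]] + [features[j] for j in nbrs]
        dim = len(features[i])
        out.append(tuple(sum(f[d] for f in group) / len(group) for d in range(dim)))
    return out


def progressive_update(vertices, n_iters=3, k=1, momentum=0.5, step=0.5):
    """Iteratively move vertices using diffused context plus accumulated history.
    (Assumption: a momentum buffer stands in for the learned history accumulation.)"""
    nbrs = ring_neighbors(len(vertices), k)
    history = [(0.0, 0.0)] * len(vertices)
    for _ in range(n_iters):
        context = diffuse(vertices, nbrs)
        # Offset suggested by the neighborhood context at this iteration.
        offsets = [(cx - x, cy - y) for (x, y), (cx, cy) in zip(vertices, context)]
        # Blend the new offsets with what was accumulated in earlier iterations.
        history = [
            (momentum * hx + (1 - momentum) * ox, momentum * hy + (1 - momentum) * oy)
            for (hx, hy), (ox, oy) in zip(history, offsets)
        ]
        vertices = [(x + step * hx, y + step * hy) for (x, y), (hx, hy) in zip(vertices, history)]
    return vertices
```

Calling `progressive_update(resample_closed_contour(polygon, 8))` on a closed polygon returns the same number of vertices in the same ring order. With this stand-in rule the contour is only smoothed; in the described method the offsets would instead come from a trained network driven by image features.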
    Results Experiments on the Cityscapes, KINS and SBD datasets demonstrate the effectiveness and speed of the proposed approach for real-time instance segmentation. Specifically, the dynamic graph model achieves the best performance on the Cityscapes and SBD datasets, with 34.4% and 55.3% average precision (AP) respectively, and reaches 30.5% AP on the KINS dataset. The visualization results also show a better fit on the KINS dataset. Compared with the Deep Snake model, the segmentation contours extracted by the dynamic graph model fit the real boundaries more closely and are smoother. In terms of running speed, the dynamic graph model also obtains the best results on the three datasets, at 3.8, 7.6 and 13.1 frames/s respectively, an average improvement of 0.5 frames/s over the Deep Snake model.
    Conclusions The progressive dynamic graph makes better use of the topology of the object contour and of dynamic updating. By propagating contextual and historical information, the method addresses the inability of segmentation algorithms to incorporate context. Future research could address the following aspects: designing a more efficient and robust network architecture, taking the objects around the target into account to incorporate the scene context of the image, and resolving the poor convergence on non-convex objects or concave regions.

     
