Active Stereo Matching Method Using Speckle Guidance and Diffusion Denoising

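  • Abstract: Traditional stereo matching often fails in weakly textured regions, whereas active stereo matching projects structured light onto the scene and thereby alleviates matching failures caused by missing texture. In recent years, deep-learning-based active stereo matching methods have made notable progress, with guided cost volume strategies being the most representative. However, current methods still face two major challenges: 1) the guidance signals used to guide the cost volume are usually sparse and unevenly distributed; 2) the constructed cost volume often contains a large amount of redundant information. To address these challenges, this paper proposes a new active stereo matching framework, SGDM (Speckle-Guided Diffusion Matching). The framework introduces a speckle variable-weight guidance mechanism to use the guidance signals more efficiently, and uses a diffusion filter to suppress cost-volume noise and improve matching accuracy. Specifically, to address the difficulty of exploiting sparse guidance signals, SGDM introduces a sparse disparity filling module that increases the number of disparity points, and a variable-weight Gaussian guidance module that optimizes the matching cost of the expanded points, improving the coverage and reliability of the guidance signals. To address the loss of matching accuracy caused by redundant information in the cost volume, SGDM introduces a diffusion filter that recursively denoises the cost volume step by step, improving its accuracy and producing high-quality dense disparity maps. Experiments on synthetic and real datasets show that integrating the SGDM modules significantly improves the overall performance of mainstream stereo matching networks: for ACVNet with SGDM, the average end-point error on the Scene Flow dataset decreases by about 16%, and the greater-than-1-pixel error on the SimStereo dataset drops from 16.60% to 4.64%; for RAFT with SGDM, the error rate on real datasets such as ETH3D and Middlebury decreases by about 26% on average. These results validate the effectiveness and generality of SGDM in improving stereo matching accuracy and cross-domain generalization.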

     

    Abstract:

    Objectives: Depth estimation is a fundamental task in computer vision and 3D reconstruction, with extensive applications in autonomous driving, robotics, and virtual reality. Traditional stereo matching algorithms often fail in weakly textured or reflective regions, leading to inaccurate disparity estimation. Active stereo matching, which projects structured light such as speckle patterns onto the scene, can alleviate these issues by enriching image textures. However, existing active stereo methods that rely on guided cost volumes still suffer from two main limitations: sparse and unevenly distributed guidance signals, and redundant information in the constructed cost volumes. To address these challenges, a novel active stereo matching framework named SGDM (Speckle-Guided Diffusion Matching) is proposed.

    Method: The proposed SGDM framework integrates three key modules to enhance the utilization of sparse guidance and suppress redundant cost-volume information. (1) A lightweight sparse disparity filling module expands the initially sparse disparity map derived from stereo pairs illuminated by speckle structured light, increasing the number and spatial uniformity of effective disparity points. (2) A variable-weight Gaussian guidance module adaptively adjusts the modulation strength according to the confidence of disparity points, thereby improving the reliability and precision of the guided cost volume. (3) A diffusion filter is employed to iteratively refine the cost volume through a diffusion-based denoising process, effectively suppressing redundant information and noise to yield a cleaner and more distinctive disparity estimation. These components can be seamlessly integrated into existing stereo matching networks without modifying their original architectures.

    Results: Comprehensive experiments were conducted on both synthetic and real-world datasets, including Scene Flow, KITTI, SimStereo, ETH3D, and Middlebury. The results show that the proposed SGDM framework substantially enhances the accuracy and robustness of stereo matching networks. Specifically, when the proposed SGDM is integrated into ACVNet, the average end-point error (EPE) on the Scene Flow dataset decreases by approximately 16%, and the proportion of pixels with disparity errors greater than one pixel on SimStereo is reduced from 16.60% to 4.64%. Furthermore, incorporating SGDM into RAFT leads to an average error reduction of 26% on ETH3D and Middlebury datasets, demonstrating remarkable cross-domain generalization. Visual comparisons further confirm that SGDM effectively improves disparity estimation in weakly textured and occluded regions, yielding sharper and more detailed disparity maps.

    Conclusion: The proposed SGDM framework effectively addresses two critical limitations in active stereo matching: sparse and uneven guidance signals, and redundant information within the cost volume. By integrating speckle-guided semi-dense disparity priors, a variable-weight Gaussian guidance mechanism, and a diffusion-based denoising filter, SGDM achieves highly accurate, noise-resilient, and generalizable depth estimation. Experimental evaluations across multiple datasets verify that SGDM consistently enhances both matching precision and cross-domain robustness when embedded into mainstream stereo matching networks. In addition to its algorithmic contributions, SGDM offers a scalable and flexible design that can be readily incorporated into various 3D perception systems. Overall, the proposed framework provides a unified and effective solution for advancing active stereo matching, paving the way for broader applications in autonomous driving, robotic perception, and precision 3D reconstruction.
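
    Illustrative sketch: the abstract does not give the formula used by the variable-weight Gaussian guidance module, so the snippet below only illustrates the general idea of confidence-weighted Gaussian modulation for a single pixel. The function name `modulate_scores`, the parameters `gain` and `sigma`, and the blending rule are all assumptions made for illustration and should not be read as SGDM's actual implementation. Scores are treated as similarities (higher means a better match).

```python
import math


def modulate_scores(scores, hint, confidence, gain=10.0, sigma=1.0):
    """Re-weight one pixel's matching scores around a guidance disparity.

    scores:     one similarity score per candidate disparity 0..D-1.
    hint:       guidance disparity for this pixel, or None if no hint exists.
    confidence: value in [0, 1]; 0 leaves the scores unchanged, 1 applies
                the full Gaussian modulation (hypothetical blending rule).
    """
    if hint is None:
        return list(scores)  # no guidance available: keep scores as they are

    w = min(max(confidence, 0.0), 1.0)  # clamp confidence to [0, 1]
    out = []
    for d, s in enumerate(scores):
        # Gaussian centered on the hinted disparity
        gauss = math.exp(-((d - hint) ** 2) / (2.0 * sigma ** 2))
        # Blend between "no change" (factor 1) and the full Gaussian factor
        out.append(s * ((1.0 - w) + w * gain * gauss))
    return out


if __name__ == "__main__":
    flat = [1.0] * 8
    # A confident hint at disparity 3 produces a sharp peak there.
    print(modulate_scores(flat, hint=3, confidence=1.0))
    # A less confident hint at disparity 5 produces a weaker peak.
    print(modulate_scores(flat, hint=5, confidence=0.5))
```

    With this blending rule, a low-confidence point (for example, one produced by disparity filling rather than measured directly) perturbs the cost curve less than a high-confidence one, which is the behavior the abstract attributes to variable-weight guidance.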

     
