Abstract
Objectives: In autonomous driving, acquiring high-precision pose information is critical to safety and reliability. Although the global navigation satellite system (GNSS) can provide absolute pose references, its positioning accuracy degrades significantly in complex scenarios due to signal obstruction and multipath effects, making it difficult to meet the stringent requirements of autonomous driving systems. To address this challenge, we systematically explored precise pose acquisition in large-scale, complex, and dynamic environments. Methods: First, based on multi-sensor fusion, we developed a reconstruction system tailored to dynamic urban environments, focusing on three representative large-scale outdoor scenarios: urban roads, tree-lined avenues, and high-traffic highways, and constructed a high-precision 3D Gaussian splatting (3DGS) map for these scenarios. In the visual front end, we integrated the SuperPoint feature extraction algorithm with the SuperGlue feature matching algorithm to achieve high-quality feature point detection and matching. We then applied the perspective-n-point (PnP) algorithm, which estimates the initial camera pose by matching image feature points with corresponding points in the 3D map. To refine the positioning results and improve the system's real-time performance, we proposed an iterative optimization strategy based on progressive rendering. This strategy exploits the geometric structure and radiance field characteristics of the 3DGS map, combining step-by-step rendering and optimization to continuously adjust the camera pose estimate, ultimately achieving progressive refinement toward real-time, high-precision positioning.
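The core idea of the pipeline — an initial pose estimate that is then iteratively refined by minimizing reprojection error against the map — can be illustrated with a simplified numpy sketch. Note this is only a toy stand-in, not the paper's method: all intrinsics, 3D points, and poses below are made-up values, the rotation is held fixed at identity, and only the translation is refined with Gauss-Newton, whereas the full system uses PnP over the full pose plus rendering-based optimization on the 3DGS map.

```python
import numpy as np

# Assumed pinhole intrinsics (illustrative values only)
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

def project(points, t):
    """Pinhole projection of Nx3 map points under translation t (R = I)."""
    p = points + t
    u = fx * p[:, 0] / p[:, 2] + cx
    v = fy * p[:, 1] / p[:, 2] + cy
    return np.stack([u, v], axis=1)

def refine_translation(points, obs, t0, iters=10):
    """Gauss-Newton on the reprojection error, translation only."""
    t = t0.astype(float).copy()
    for _ in range(iters):
        p = points + t
        r = (project(points, t) - obs).ravel()  # 2N residual vector
        J = np.zeros((2 * len(points), 3))      # Jacobian of residuals wrt t
        J[0::2, 0] = fx / p[:, 2]
        J[0::2, 2] = -fx * p[:, 0] / p[:, 2] ** 2
        J[1::2, 1] = fy / p[:, 2]
        J[1::2, 2] = -fy * p[:, 1] / p[:, 2] ** 2
        t += np.linalg.solve(J.T @ J, -J.T @ r)  # normal equations step
    return t

rng = np.random.default_rng(0)
points = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))  # synthetic 3D map points
t_true = np.array([0.3, -0.2, 0.5])
obs = project(points, t_true)   # "matched" 2D observations of the map points
t_est = refine_translation(points, obs, t0=np.zeros(3))
print(np.round(t_est, 6))       # converges to the true translation
```

In the actual system, the residual would instead come from comparing the camera image with a progressive 3DGS rendering at the current pose estimate, but the iterate-and-correct structure is the same.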
Results: Experimental validation demonstrates the following: (1) In dynamic urban environments, the reconstruction system performs exceptionally well, rendering continuous and realistic lighting and texture effects in real time; in particular, it effectively handles lighting changes and occlusions caused by dynamic objects in complex urban settings. (2) The SuperPoint and SuperGlue feature extraction and matching algorithms significantly outperform traditional methods in both the quantity and quality of feature points: the number of extracted feature points is greatly increased, and the feature point distribution rate reaches 70%, more than twice that of conventional methods. (3) The system achieves excellent positioning accuracy in the three typical outdoor scenarios. On urban roads, tree-lined avenues, and high-traffic highways, the average positioning accuracies are 0.026 m, 0.029 m, and 0.081 m, respectively, significantly better than other representative methods. Moreover, absolute translation and rotation errors are below 10 cm and 1°, respectively, in 96% of frames, indicating highly stable positioning accuracy and reliable real-time positioning across environments. (4) Ablation experiments further validate the effectiveness of the system's design. The complete model achieves the best results across the typical scenarios, with average absolute translation and rotation errors of 3.22 cm and 0.69°, respectively, outperforming variants with dynamic occlusion handling, SuperPoint feature extraction, or the iterative optimization strategy removed. (5) Finally, the system's real-time performance has been verified.
Time analysis shows that the system processes each image frame in approximately 1 s, which meets the real-time requirements of the targeted application scenarios. Conclusions: In large-scale outdoor environments, visual relocalization based on the 3DGS map enables retrieval of high-precision pose information, effectively compensating for the limitations of existing positioning methods under GNSS signal occlusion or in complex environments. Through multi-sensor fusion and precise map construction, the system performs exceptionally well in dynamic urban scenarios; in particular, lighting changes and dynamic-object occlusion in complex urban environments are handled effectively, significantly enhancing robustness and adaptability. The front-end feature extraction and matching methods greatly improve the system's stability and accuracy: in complex dynamic scenes, feature points are distributed more uniformly, further enhancing positioning precision. Compared with other representative methods, the system achieves the best performance, and the ablation experiments validate the necessity and effectiveness of each component. The system's processing speed meets real-time requirements, ensuring efficient and reliable real-time localization.