A Large-Scale Surface Reconstruction Method for Aerial Imagery Based on 3D Gaussian Splatting
Abstract
Objectives: Large-scale surface reconstruction has long been a central topic in both academia and industry, particularly in aerial surveying, 3D urban modeling, and smart-city applications. Although methods based on Neural Radiance Fields (NeRF) have achieved remarkable results in novel view synthesis and shown potential for surface reconstruction, their substantial computational overhead limits their applicability to large-scale scenes. 3D Gaussian Splatting (3DGS) alleviates this limitation by using explicit Gaussian primitives and a rasterization-like forward-splatting pipeline that runs efficiently on modern GPUs. Despite its strong performance in high-fidelity novel view synthesis, directly applying 3DGS to large-scale aerial imagery remains challenging, because weak textures, shadows, and massive scene extents are common in such data. This study addresses these issues and explores the feasibility of using 3DGS for large-scale aerial surface reconstruction.
Methods: We propose Aerial Gaussian Splatting (AGS), a 3DGS-based large-scale surface reconstruction framework designed specifically for aerial scenarios. To overcome the memory and computation bottlenecks caused by extensive aerial scenes, AGS introduces an adaptive aerial scene partitioning strategy. This strategy uses the camera distribution and the visibility relationships derived from Structure-from-Motion (SfM) to divide the entire region into multiple spatially coherent, independent blocks. Each block can be optimized in parallel and later merged seamlessly, which preserves global geometric consistency while significantly reducing per-block memory consumption. To improve reconstruction in weak-texture and shadowed regions, AGS incorporates the existing Ray-Gaussian Intersection (RGI) technique to extract unbiased depth and normal information directly from the Gaussian primitives. Building on RGI, we further propose a Depth-Gradient Enhanced Optimization (DGE) module, which introduces depth-gradient priors obtained from a monocular depth estimation model. These priors strengthen the geometric constraints and improve detail preservation under visually ambiguous conditions. In addition, AGS employs a multi-view depth alignment strategy that enforces geometric consistency between each training view and its neighboring viewpoints through a projection-reprojection mechanism, compensating for the weak geometric supervision in the original 3DGS pipeline.
Results: We evaluated the proposed framework on the WHU-OMVS and Tianjin aerial datasets. The experiments show that AGS achieves higher depth estimation accuracy than existing state-of-the-art 3DGS-based methods and also outperforms conventional multi-view stereo reconstruction software such as COLMAP. Furthermore, AGS achieves high-quality rendering on the Mill-19 and UrbanScene3D datasets, consistently surpassing competing approaches in visual quality.
Conclusions: This work presents AGS, a dedicated extension of 3DGS for large-scale aerial surface reconstruction. By integrating adaptive scene partitioning, geometry-aware rendering, depth-gradient enhanced optimization, and multi-view depth alignment, AGS addresses the challenges posed by large scene extents, weak textures, and limited geometric supervision. The results validate the feasibility of applying 3DGS to large-scale aerial surface reconstruction and highlight its potential for broader applications in urban mapping, geospatial analysis, and environmental monitoring.
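The projection-reprojection mechanism named in the Methods can be illustrated with a minimal sketch. This is not the AGS implementation: the abstract does not specify the loss, the sampling scheme, or the camera model, so the pinhole model, the nearest-neighbor depth lookup, and every name below (`Pinhole`, `rotate`, `reprojection_error`) are assumptions made for illustration. A pixel of the reference view is lifted to 3D with its depth, projected into a neighboring view, lifted again with the neighbor's depth, and projected back; the pixel distance of this round trip measures how well the two depth maps agree.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pinhole:
    """Ideal pinhole intrinsics (assumed shared by both views, no distortion)."""
    fx: float
    fy: float
    cx: float
    cy: float

    def backproject(self, u: float, v: float, depth: float) -> Vec3:
        # Lift a pixel with known depth to a 3D point in camera coordinates.
        return ((u - self.cx) / self.fx * depth,
                (v - self.cy) / self.fy * depth,
                depth)

    def project(self, p: Vec3) -> Tuple[float, float]:
        # Project a 3D camera-space point (z > 0) onto the image plane.
        return (self.fx * p[0] / p[2] + self.cx,
                self.fy * p[1] / p[2] + self.cy)


def rotate(R: Sequence[Sequence[float]], p: Sequence[float],
           transpose: bool = False) -> Vec3:
    # Apply R (or its transpose, i.e. the inverse rotation) to p.
    if transpose:
        return tuple(sum(R[k][i] * p[k] for k in range(3)) for i in range(3))
    return tuple(sum(R[i][k] * p[k] for k in range(3)) for i in range(3))


def reprojection_error(cam: Pinhole, u: int, v: int,
                       ref_depth: Sequence[Sequence[float]],
                       nbr_depth: Sequence[Sequence[float]],
                       R: Sequence[Sequence[float]],
                       t: Sequence[float]) -> Optional[float]:
    """Round-trip pixel error for reference pixel (u, v).

    R, t map reference-camera coordinates to neighbor-camera coordinates:
    X_nbr = R @ X_ref + t. Returns None when the point is not visible in
    the neighbor view or ends up behind either camera.
    """
    # Reference pixel -> 3D point -> neighbor image.
    p_ref = cam.backproject(u, v, ref_depth[v][u])
    p_nbr = tuple(a + b for a, b in zip(rotate(R, p_ref), t))
    if p_nbr[2] <= 0:
        return None
    un, vn = cam.project(p_nbr)
    ui, vi = round(un), round(vn)  # nearest-neighbor depth lookup
    if not (0 <= vi < len(nbr_depth) and 0 <= ui < len(nbr_depth[0])):
        return None

    # Neighbor depth -> 3D point -> back into the reference image.
    q_nbr = cam.backproject(un, vn, nbr_depth[vi][ui])
    q_ref = rotate(R, tuple(a - b for a, b in zip(q_nbr, t)), transpose=True)
    if q_ref[2] <= 0:
        return None
    u2, v2 = cam.project(q_ref)
    return math.hypot(u2 - u, v2 - v)


# Toy scene: a fronto-parallel plane at depth 10 seen by two cameras
# that differ only by a sideways shift.
cam = Pinhole(fx=10.0, fy=10.0, cx=4.0, cy=4.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift = (-1.0, 0.0, 0.0)
plane = [[10.0] * 9 for _ in range(9)]
consistent = reprojection_error(cam, 4, 4, plane, plane, identity, shift)
```

When the two depth maps describe the same surface, as in the toy plane above, the round trip returns to the starting pixel and the error is zero; a disagreeing neighbor depth moves the reprojected pixel away from it. In an optimization setting, such an error could be thresholded to mask unreliable pixels or used as a penalty term, but which of these AGS does is not stated in the abstract.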