Abstract:
Objectives: As Intelligent Connected Vehicles (ICVs) equipped with cameras, LiDAR, GNSS receivers, and IMUs steadily improve their spatiotemporal perception capability, they create a new opportunity for low-cost, high-frequency production and updating of maps for autonomous driving. Conventional centralized mapping, which relies on professional mobile mapping systems, is limited by high cost, low update frequency, and restricted spatial coverage. To address these limitations, this study aims to summarize the key technologies of crowdsourced perception mapping for ICVs, establish a vehicle–cloud collaborative technical framework, and analyze its practical value for large-scale map production and updating.
Methods: Starting from the requirements for vehicle–cloud collaborative processing of multi-source spatiotemporal perception data, including video, point clouds, and trajectories, the study proposes a crowdsourced perception mapping framework for ICVs. Within this framework, the vehicle side is responsible for real-time sensing, localization, semantic extraction, and local vectorized mapping, while the cloud side performs large-scale map learning, road network generation, differential updating, automated annotation, and model optimization. On the vehicle side, the study reviews key technologies for local mapping, including multi-sensor localization, mapping based on simultaneous localization and mapping (SLAM), and vectorized map construction based on bird's-eye-view (BEV) perception. On the cloud side, it analyzes key technologies for global mapping from three aspects: road network extraction from crowdsourced trajectories, differential updating of map elements, and automated annotation of map data. Furthermore, a practical crowdsourced mapping system jointly developed with an automotive enterprise is introduced to validate the proposed framework.
Results: The proposed vehicle–cloud crowdsourced perception mapping framework supports a full workflow covering data acquisition, local processing, data upload, cloud fusion, automated map production, quality inspection, map editing, review, and distribution. In the enterprise practice, the system extracted more than 100 mapping elements in four major categories: road network information, ground markings, roadside facilities, and overhead signs. On the vehicle side, an integrated technical chain of multi-sensor calibration, reliable perception, and stable local mapping was established, enabling high-precision positioning and vectorized extraction of local map elements. On the cloud side, a map learning and production pipeline was developed to support large-scale crowdsourced road network construction, map element generation, change detection, and automated annotation. This deployment demonstrated the engineering feasibility of crowdsourced perception mapping in complex urban scenes and verified its capability for scalable map production and rapid updating.
Conclusions: Crowdsourced perception mapping provides an effective technical route for overcoming the limitations of conventional centralized mapping and is becoming an important paradigm for the production and maintenance of autonomous driving maps. By integrating vehicle-side local mapping and cloud-side global mapping, the proposed framework enables low-cost, wide-coverage, and continuously updated map construction. The enterprise case confirms its practical applicability for extracting diverse map elements and supporting large-scale map generation. Future work should further address challenges in secure and standardized data governance, robustness of low-cost vehicle-side perception, generalization of cloud-side multi-source mapping algorithms, and the construction of spatiotemporal information infrastructure for large-scale autonomous driving applications.
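Illustration: The cloud-side differential updating of map elements named in the Methods can be made concrete with a minimal Python sketch. It is an illustrative assumption, not the pipeline of the system described above. The names `Observation`, `diff_update`, `match_radius`, and `min_support` are hypothetical, positions are assumed to lie in a shared planar frame in metres, and the surveyed area is assumed to have been driven by the contributing vehicles. The sketch matches each crowdsourced observation to the existing map by element class and distance, groups unmatched observations, and reports a new element only when enough observations agree.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Observation:
    kind: str   # element class reported by a vehicle, e.g. "stop_line"
    x: float    # position in a shared planar frame, metres (assumed)
    y: float


def centroid(points):
    """Mean position of a non-empty list of observations."""
    n = len(points)
    return sum(p.x for p in points) / n, sum(p.y for p in points) / n


def diff_update(base_map, observations, match_radius=2.0, min_support=3):
    """Compare crowdsourced observations with the current map.

    base_map maps an element id to its Observation-like record.
    Returns (added, removed): candidate new elements and the ids of
    existing elements that no observation confirmed.
    """
    seen_ids = set()
    clusters = []  # each cluster holds unmatched observations of one kind

    for obs in observations:
        # 1) Try to match the observation to an existing map element.
        match = next(
            (eid for eid, el in base_map.items()
             if el.kind == obs.kind
             and hypot(el.x - obs.x, el.y - obs.y) <= match_radius),
            None,
        )
        if match is not None:
            seen_ids.add(match)
            continue

        # 2) Otherwise group it with nearby unmatched observations.
        for cluster in clusters:
            cx, cy = centroid(cluster)
            if (cluster[0].kind == obs.kind
                    and hypot(cx - obs.x, cy - obs.y) <= match_radius):
                cluster.append(obs)
                break
        else:
            clusters.append([obs])

    # 3) A cluster becomes an "added" candidate only with enough support.
    added = [
        Observation(c[0].kind, *centroid(c))
        for c in clusters if len(c) >= min_support
    ]
    # 4) Elements never re-observed are "removed" candidates.
    removed = sorted(set(base_map) - seen_ids)
    return added, removed
```

Requiring `min_support` agreeing observations is one simple way to keep a single noisy detection from changing the map. A production system would also need to verify that a "removed" element was actually within sensor view before deleting it.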