Abstract:
Objectives: With the rapid increase in vehicle ownership and the growing complexity of road environments, traditional traffic management methods face the dual challenges of perception accuracy and response efficiency in dense, dynamic highway scenarios. Video surveillance, as a low-cost, real-time sensing approach, has become an essential data source for vehicle information collection. However, existing video-based studies focus primarily on vehicle detection and often fail to perceive multidimensional attributes such as speed, color, traffic flow, and lane position accurately. To address these limitations, a video data-driven framework is proposed for precise perception of multidimensional vehicle information on highways, aiming to enhance the comprehensiveness and reliability of traffic perception.
Methods: First, a scale-adaptive highway vehicle detection method is introduced. The proposed C3k2_DSConv module enhances the model's adaptability to vehicles of various scales and morphologies by integrating depthwise separable convolutions and feature reuse strategies. A convolution kernel adjustment mechanism is further designed within the Dynamic Head to improve detection robustness under diverse viewing angles and occlusion conditions. Second, based on the detected vehicle bounding boxes, multidimensional vehicle attributes are derived through specialized perception algorithms: (1) Vehicle speed is estimated using a dual-virtual-line timing model, which tracks vehicle centroids frame by frame and converts the crossing interval between two virtual lines into a velocity. (2) Vehicle color is identified by combining RGB-to-HSV color space transformation with feature enhancement through region-based color histogram mapping. (3) Traffic flow is calculated via a virtual-line constraint counting method that minimizes duplicate counting. (4) Lane position is calibrated using a virtual bounding-box constraint model that aligns vehicle trajectories with detected lane boundaries. Minimal code sketches of the detection and perception components are given below. To validate the proposed framework, a new highway vehicle dataset is constructed from real-world surveillance footage, incorporating variations in vehicle type, illumination, and scale to ensure robustness and generalization.
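The internal structure of C3k2_DSConv is not spelled out here; as a point of reference, a minimal PyTorch sketch of the depthwise separable convolution it builds on is shown below. Class and parameter names are illustrative, not the paper's; the point is the parameter and FLOP savings of a per-channel convolution followed by a 1x1 pointwise convolution.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) conv
    followed by a 1x1 (pointwise) conv, reducing parameters and FLOPs
    relative to a standard convolution of the same receptive field.
    Illustrative building block, not the paper's exact C3k2_DSConv."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, stride,
                                   padding=k // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```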
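For the dual-virtual-line timing model, a minimal sketch might look as follows, assuming two virtual lines a known road distance apart and a fixed camera frame rate; the same crossing test can also drive the one-shot counting used for traffic flow. All function names and the known-line-spacing assumption are illustrative.

```python
def crossed(prev_cy: float, cy: float, line_y: float) -> bool:
    """True when a tracked centroid moves across a horizontal virtual line
    between consecutive frames; also usable to count each vehicle once."""
    return (prev_cy < line_y) != (cy < line_y)

def estimate_speed_kmh(entry_frame: int, exit_frame: int,
                       line_gap_m: float, fps: float) -> float:
    """Dual-virtual-line timing: speed = known line spacing / elapsed time,
    with time measured as a frame count divided by the camera frame rate."""
    elapsed_s = (exit_frame - entry_frame) / fps
    return line_gap_m / elapsed_s * 3.6  # m/s -> km/h

# Example: 30 frames at 25 fps over a 20 m line spacing -> 60 km/h.
print(estimate_speed_kmh(120, 150, 20.0, 25.0))
```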
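For color identification, the sketch below shows the RGB-to-HSV transformation followed by hue-histogram mapping over the detected region, assuming OpenCV; the hue ranges are hand-picked placeholders, and the paper's calibrated ranges and feature enhancement step are not reproduced here.

```python
import cv2
import numpy as np

# Illustrative hue ranges (OpenCV hue spans 0..179), not the paper's calibration.
HUE_RANGES = {
    "red": [(0, 10), (170, 179)],
    "yellow": [(20, 34)],
    "green": [(35, 85)],
    "blue": [(100, 130)],
}

def classify_color(bgr_roi: np.ndarray) -> str:
    """Map a vehicle ROI to a dominant color via an HSV hue histogram.
    Low-saturation/low-value pixels are treated as white/gray/black."""
    hsv = cv2.cvtColor(bgr_roi, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    chromatic = (s > 60) & (v > 60)
    if chromatic.mean() < 0.2:  # mostly achromatic ROI
        if v.mean() < 80:
            return "black"
        return "white" if v.mean() > 170 else "gray"
    mask = chromatic.astype(np.uint8) * 255
    hist = cv2.calcHist([hsv], [0], mask, [180], [0, 180]).ravel()
    scores = {name: sum(hist[lo:hi + 1].sum() for lo, hi in spans)
              for name, spans in HUE_RANGES.items()}
    return max(scores, key=scores.get)
```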
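Lane calibration via the virtual bounding-box constraint model is sketched only at the level of the final assignment step: given lane boundaries fitted as functions x(y) in image coordinates, the vehicle centroid is binned into the lane whose boundaries bracket it. The callable-boundary representation is an assumption for illustration, not the paper's constraint model.

```python
from typing import Callable, List

def assign_lane(cx: float, cy: float,
                boundaries: List[Callable[[float], float]]) -> int:
    """Return the index of the lane whose fitted boundaries bracket the
    vehicle centroid (cx, cy); boundaries are ordered left to right and
    evaluated at the centroid's image row. -1 means outside all lanes."""
    xs = [b(cy) for b in boundaries]
    for i in range(len(xs) - 1):
        if xs[i] <= cx < xs[i + 1]:
            return i
    return -1

# Example with three straight boundaries (two lanes):
lanes = [lambda y: 100.0, lambda y: 300.0, lambda y: 500.0]
print(assign_lane(350.0, 400.0, lanes))  # -> 1 (second lane)
```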
Results: Comprehensive experiments were conducted against state-of-the-art detectors, including YOLOv5 through YOLOv13, RT-DETR, SSD, Faster R-CNN, and RetinaNet. The proposed method achieved superior results, with precision (P), recall (R), and mAP@0.5 of 0.903, 0.899, and 0.954, respectively, outperforming all baseline models. Furthermore, multidimensional perception experiments showed that vehicle detection reached an mAP@0.5 of 95.4%, traffic flow counting accuracy was 99%, lane calibration accuracy reached 92%, vehicle color recognition accuracy was 81%, and the average absolute error of vehicle speed estimation was 9.36%. These quantitative results confirm the effectiveness and robustness of the proposed framework in complex highway environments.
Conclusions: The framework demonstrates that video-based methods can achieve high-precision perception of multiple vehicle attributes when enhanced with adaptive feature extraction and multidimensional information modeling. It enables comprehensive perception of speed, color, traffic flow, and lane position, thereby providing a reliable basis for intelligent highway management, traffic law enforcement, and accident early-warning systems.