Constructing an Urban Waterlogging Depth Level Assessment Model Using Images Captured by the Public

Constructing an Urban Waterlogging Depth Assessment Model Using Public Video Information

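  • Abstract: Beyond the extent of water accumulation, water depth is the more important criterion for gauging the severity of urban waterlogging. Traditional methods of monitoring urban waterlogging are costly, cover a limited range, and can only watch specific flood-prone areas. To address these limitations, this study proposes an urban waterlogging depth level assessment model built from images captured by the public. Using web crawling with keywords such as "urban waterlogging" and "vehicles submerged", we collected publicly captured images and videos of small vehicles in water from the internet to build a sample set. Taking vehicle submersion depth as the reference standard, we divided waterlogging depth into three levels, safe (0-20 cm), unsafe (20-60 cm), and dangerous (>60 cm), and created a label for each level. We built four YOLOv8 models of differing network complexity (YOLOv8l, YOLOv8m, YOLOv8n, and YOLOv8s) and, during training, evaluated their accuracy using a confusion matrix and four accuracy metrics: Precision, Recall, mean Average Precision (mAP), and Average Precision (AP). The experimental results confirm that YOLOv8 can effectively identify the submersion state of vehicles in different scenes, with an average precision of 70% for vehicle recognition across the danger levels. This level of accuracy indicates that the model can serve as an effective tool for monitoring and assessing urban waterlogging depth levels. The study improves the timeliness and accuracy of waterlogging disaster assessment, extends the use of social information in disaster management, and offers a solution for future urban disaster management that combines artificial intelligence with information from the public.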
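The three depth levels described above can be written as a small lookup. This is only an illustrative sketch: the names `DepthLevel` and `depth_level` are hypothetical, and the treatment of the boundary values (exactly 20 cm and exactly 60 cm) is an assumption, since the abstract gives only the ranges. In the study itself the level is predicted from images by a detector, not computed from a measured depth.

```python
from enum import Enum


class DepthLevel(Enum):
    SAFE = "safe"            # 0-20 cm
    UNSAFE = "unsafe"        # 20-60 cm
    DANGEROUS = "dangerous"  # > 60 cm


def depth_level(depth_cm: float) -> DepthLevel:
    """Map a water depth in centimetres to one of the three levels.

    Assumption: a depth of exactly 20 cm counts as safe and exactly
    60 cm counts as unsafe; the abstract does not specify boundaries.
    """
    if depth_cm < 0:
        raise ValueError("depth must be non-negative")
    if depth_cm <= 20:
        return DepthLevel.SAFE
    if depth_cm <= 60:
        return DepthLevel.UNSAFE
    return DepthLevel.DANGEROUS
```

For example, `depth_level(35)` returns `DepthLevel.UNSAFE` under these assumptions.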

     

    Abstract: Objectives: Beyond the extent of water accumulation, water depth is the more important criterion for measuring the severity of urban waterlogging. Traditional urban waterlogging monitoring methods are costly, cover a limited range, and can only monitor specific flood-prone areas. To address these limitations, we propose a model that assesses urban waterlogging depth levels from images captured by the public. Methods: We used web crawling with keywords such as "urban waterlogging" and "vehicles submerged" to collect images and videos of small vehicles in water from the internet and built a sample set from them. Using vehicle submersion depth as the reference standard, we divided waterlogging depth into three levels, safe (0-20 cm), unsafe (20-60 cm), and dangerous (>60 cm), and created a label for each level. We built four YOLOv8 models of differing network complexity: YOLOv8l, YOLOv8m, YOLOv8n, and YOLOv8s. During training, we evaluated model accuracy using a confusion matrix and four accuracy metrics, namely Precision, Recall, mean Average Precision (mAP), and Average Precision (AP). Results: (1) On the training set, all four YOLOv8 models converged after about 40 epochs, with Precision, Recall, and mAP50 stabilizing at around 60%-70% and mAP50:95 at around 35%. (2) On the test set, the highest Precision was achieved by YOLOv8m for the safe class (80.7%), YOLOv8l for the unsafe class (55.4%), and YOLOv8s for the dangerous class (84.8%). The highest Recall was achieved by YOLOv8m for both the safe and unsafe classes (66.3% and 68.8%, respectively) and by YOLOv8n for the dangerous class (68.3%). YOLOv8s had the highest mAP50 for all three classes: 74.4% for safe, 61.7% for unsafe, and 76.2% for dangerous.
The highest mAP50:95 was achieved by YOLOv8m for the safe class (43.1%) and by YOLOv8s for both the unsafe and dangerous classes (37.7% and 36.7%, respectively). (3) Network depth and width, together with feature extraction ability, also have a significant impact on model performance. Increasing the width and depth of the network greatly increases the number of parameters and the computational load, but not every evaluation metric improves as a result. Although YOLOv8l is the most complex model, its Precision and mAP50 are not the best, ranking fourth and third at 69.3% and 68.0%, respectively, while its Recall and mAP50:95 are the highest. YOLOv8n has the fewest parameters and the lowest computational load, yet its mAP50 still ranks second. Overall, YOLOv8m performed best on the validation set, with the highest Precision, mAP50, and mAP50:95; its Precision and mAP50 were 2.2% and 2.6% higher than those of the second-ranked model, respectively, without much sacrifice in Recall. (4) All four models detect the unsafe class less accurately than the other two levels. One reason is sample imbalance: the unsafe class has 68% as many samples as the safe class and 59% as many as the dangerous class. Another is the limited detection performance of the model in ambiguous cases: moving vehicles splash water, which leads to misjudged water levels; when the water reaches exactly halfway up the tire, the unsafe class is easily confused with the safe class; and when the water reaches exactly the engine's exhaust outlet, it is easily confused with the dangerous class. As a result, the unsafe class has the lowest probability of being detected correctly.
To address this issue, future research will apply targeted data augmentation to improve the model's recognition of unsafe-class samples, and will introduce attention mechanism modules into the network structure to strengthen the model's learning of key features.
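As a reading aid for the per-class Precision and Recall figures reported above, the following minimal sketch shows how the two metrics follow from a confusion matrix. It is a simplified classification-style calculation, not the study's evaluation code: detection metrics also depend on IoU and confidence thresholds and on a background class, which are ignored here. The function name `per_class_precision_recall`, the row/column convention, and the counts in `example` are all assumptions made up for illustration and are not results from the paper.

```python
def per_class_precision_recall(confusion, labels):
    """Return {label: (precision, recall)} from a square confusion matrix.

    Assumed convention: confusion[i][j] is the number of samples whose
    true class is labels[i] and whose predicted class is labels[j].
    """
    n = len(labels)
    results = {}
    for k, name in enumerate(labels):
        true_positives = confusion[k][k]
        predicted_as_k = sum(confusion[i][k] for i in range(n))  # column sum
        actually_k = sum(confusion[k])                           # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        results[name] = (precision, recall)
    return results


# Made-up counts for illustration only, not data from the study.
example = [
    [80, 15, 5],   # true safe
    [20, 60, 20],  # true unsafe
    [5, 10, 85],   # true dangerous
]
scores = per_class_precision_recall(example, ["safe", "unsafe", "dangerous"])
```

In this made-up matrix the middle class loses samples to both neighbours, so its recall (0.6) is the lowest of the three, which mirrors the kind of two-sided confusion the abstract describes for the unsafe class.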

     
