Semantic Segmentation of Street View and Multi-dimensional Feature Identification of City Based on MS-DeepLabV3+

柳林; 马泽鹏; 孙毅; 李万武; 项子诚

doi:10.13203/j.whugis20220773

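Summary: Traditional urban feature identification relies on spatial and statistical methods to extract analysis indicators, and the resulting evaluation indicators are highly subjective. Street view images contain visual information about a city and can be used to identify urban features. Taking Qingdao, China as an example, a multi-scale semantic segmentation model for street views, MS-DeepLabV3+, is constructed. A full feature extraction channel is added in the encoder to aggregate multi-scale features; a multi-scale feature extraction channel is added in the decoder to capture low-level features effectively; and an attention mechanism module and channel attention are introduced to focus on key features and improve the accuracy of street view semantic segmentation. The model's mean intersection over union, precision, and recall improve by 3.47%, 2.37%, and 3.96%, respectively. At the plot scale, a multi-dimensional urban feature vector is built from six dimensions (environment, facility convenience, economic affluence, transportation, urban safety, and urban synthesis) and combined with point-of-interest data and residential land data to characterize the urban features of each urban area of Qingdao. The Grad-CAM method is used for interpretability analysis of the semantic segmentation model, and the SHAP feature attribution method is used to uncover the intrinsic drivers of the multi-dimensional urban features. The results show that different urban areas have different feature vectors, each with advantages in specific dimensions. The findings help optimize multi-dimensional features in urban space and provide a reference for urban planning and construction.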

Abstract:
Objectives: Traditional methods of identifying urban features use spatial and statistical algorithms to extract analysis indicators, but the resulting evaluation indicators are highly subjective. Street view images contain visual information about the city and can be used to identify urban features.
Methods: Taking Qingdao, China as an example, this paper proposes a multi-scale semantic segmentation model for street view images, named MS-DeepLabV3+. The model adds full feature extraction channels in the encoder to aggregate multi-scale features, and adds multi-scale feature extraction channels in the decoder to capture low-level features effectively. A convolutional block attention module and efficient channel attention modules are introduced to focus on key features and improve the accuracy of street view semantic segmentation. The mean intersection over union, precision, and recall of the proposed model increase by 3.47%, 2.37%, and 3.96%, respectively. We build a multi-dimensional urban feature vector in six dimensions: environment, facility convenience, economic affluence, transportation, urban safety, and urban synthesis. The semantic segmentation results of the street view images are combined with point-of-interest data and residential land use data. At the plot scale, we extract the feature vectors and calculate the values in the six dimensions to characterize the urban features of each urban area in Qingdao. The Grad-CAM method is used for interpretability analysis of the semantic segmentation model, and the SHAP feature attribution method is used to mine the intrinsic drivers of the multi-dimensional urban features.
Results: Different urban areas have different feature vectors, and each area's feature vector has advantages in specific dimensions.
Conclusions: The analysis helps optimize multi-dimensional features in urban space and provides a reference for urban planning and construction.
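
The three reported metrics (mean intersection over union, precision, and recall) can be illustrated with a minimal sketch that derives them from a confusion matrix over pixel labels. The abstract does not describe the evaluation protocol, so the per-class (macro) averaging, the handling of absent classes, and the function names `confusion_matrix` and `mean_metrics` are assumptions made here for illustration only.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the true class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_metrics(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (mIoU, mean precision, mean recall).

    Assumption: metrics are averaged over classes that appear in either
    the labels or the predictions; classes absent from both are skipped.
    """
    n = len(cm)
    ious, precisions, recalls = [], [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true other
        if tp + fp + fn == 0:
            continue  # class never occurs; skip it
        ious.append(tp / (tp + fp + fn))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    if not ious:
        raise ValueError("confusion matrix contains no samples")
    k = len(ious)
    return sum(ious) / k, sum(precisions) / k, sum(recalls) / k


# Toy example with three classes and six flattened "pixels".
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
miou, precision, recall = mean_metrics(confusion_matrix(labels, preds, 3))
```

In practice the label and prediction lists would be the flattened ground-truth and predicted segmentation masks accumulated over the whole test set, and a reported improvement would be the difference between two models' values of these metrics.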
