Road Image Free Space Detection via Learnable Deep Position Encoding

HAN Ting, CHEN Siyu, MA Jin, CAI Guorong, ZHANG Wuming, CHEN Yiping

Citation: HAN Ting, CHEN Siyu, MA Jin, CAI Guorong, ZHANG Wuming, CHEN Yiping. Road Image Free Space Detection via Learnable Deep Position Encoding[J]. Geomatics and Information Science of Wuhan University, 2024, 49(4): 582-594. DOI: 10.13203/j.whugis20230252


Funding: National Natural Science Foundation of China (No. 42371343)

    About the authors:

    HAN Ting, PhD candidate, specializes in the theory and methods of semantic segmentation for images and point clouds. E-mail: ting.devin.han@gmail.com

    Corresponding author:

    CHEN Yiping, PhD, associate professor. E-mail: chen79@mail.sysu.edu.cn

  • CLC number: P208

Road Image Free Space Detection via Learnable Deep Position Encoding

  • Abstract:

    Drivable-area (free space) detection is a key foundation of scene perception in driver assistance systems. Methods based on convolutional neural networks have difficulty capturing global context and therefore suffer from completeness problems such as holes and interruptions in the detected road, whereas Transformer-based methods lack local understanding and tend to misalign or overrun road boundaries. To overcome the defects of both families of methods, a pyramid Transformer architecture guided by learnable deep position encoding is proposed, which fuses convolutional neural networks with Transformers for road drivable-area detection. The framework builds a pyramid Transformer backbone that extracts road features over a global receptive field, combines local window attention to compensate for the loss of detail, and adopts shrunken self-attention to improve the efficiency of feature computation. To address the fact that conventional position encoding in Transformers ignores the spatial correlation between pixels and the real scene, a learnable position encoding constructed from convolutional features of the depth image is proposed, resolving the attention drift and semantic misalignment caused by this disconnection from real-world geometry. The method was tested and evaluated on the KITTI road dataset, Cityscapes, and a self-built Xiamen road dataset. The results show that it achieves high stability and accuracy while remaining efficient, with maximum F-measures of 97.53% on KITTI and 98.54% on Cityscapes, outperforming all methods currently on the KITTI road benchmark. The method can provide high-precision semantic prior information for tasks such as path planning and trajectory prediction in driver assistance systems.

    Abstract:
    Objectives 

    Freespace detection is a crucial foundation for scene perception in advanced driver assistance systems. Convolutional neural network-based methods struggle to capture global contextual information, which produces voids and interruptions in the predicted results. Meanwhile, Transformer-based methods lack local understanding, resulting in boundary misalignment and overrun.

    Methods 

    To this end, we propose a pyramid Transformer architecture with learnable deep position encoding for road freespace detection. First, a pyramid Transformer backbone is designed to extract road features from a global perspective. Second, local window attention is employed in dual-Transformer blocks to compensate for detail loss. Finally, to address the problem that traditional, unlearnable position encoding ignores the spatial correlation between pixels and the real world, a learnable position encoding is constructed from deep convolutional features of the depth image, solving attention drift and semantic misalignment (an illustrative sketch of this encoding follows the abstract).

    Results 

    The model is tested and evaluated on the KITTI road dataset, Cityscapes, and a self-built Xiamen road dataset. The results show that our method achieves maximum F-measures of 97.53% and 98.54% on KITTI and Cityscapes, respectively.

    Conclusions 

    Our method outperforms all existing methods on the KITTI road benchmark, delivering higher stability and accuracy while maintaining high efficiency. It also provides high-precision semantic prior information for tasks such as path planning and trajectory prediction in advanced driver assistance systems.
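The learnable deep position encoding summarized in the Methods section can be pictured as a small module that convolves the depth image and adds the result to the patch embeddings of a pyramid-Transformer stage. The following is a minimal sketch under stated assumptions, not the authors' released implementation; the module name DepthPositionEncoding and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class DepthPositionEncoding(nn.Module):
    """Learnable position encoding built from convolutional depth features."""

    def __init__(self, embed_dim: int, stride: int):
        super().__init__()
        # Two small convolutions map the raw depth map to a per-location
        # encoding with the same channel width as the token embeddings;
        # `stride` matches the patch-embedding downsampling of the stage.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, embed_dim // 2, kernel_size=3, stride=stride, padding=1),
            nn.GELU(),
            nn.Conv2d(embed_dim // 2, embed_dim, kernel_size=3, padding=1),
        )

    def forward(self, tokens: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C) patch embeddings; depth: (B, 1, H, W) depth image
        pe = self.encoder(depth)            # (B, C, H/stride, W/stride)
        pe = pe.flatten(2).transpose(1, 2)  # (B, N, C), aligned with the tokens
        return tokens + pe                  # position-aware tokens

# Example: a stride-4 stage of a 256x256 input yields 64x64 = 4096 tokens
x = torch.randn(2, 4096, 96)
d = torch.randn(2, 1, 256, 256)
x = DepthPositionEncoding(embed_dim=96, stride=4)(x, d)
```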

    At present, the BeiDou Navigation Satellite System (BDS) has achieved regional coverage. As the system is built out and its applications expand, the development of related data processing software has become an important research topic, and building domestic high-precision BDS data processing software is an urgent task for high-precision location-based services [1-8]. Because BDS differs markedly from GPS in constellation design, coordinate frame, time system, and signal frequencies [9-15], existing high-precision GPS software cannot process BDS data directly. This paper studies the system design, data flow, functional modules, and high-precision algorithms of BDS data processing, presents a high-precision baseline solution software called BGO (BeiDou Navigation Satellite System/Global Positioning System Office), and applies it to establishing a high-precision control network for a high-speed railway. Comparative tests against the commercial packages TGO (Trimble Geomatics Office) and TBC (Trimble Business Center) and the scientific software Bernese verify the correctness and effectiveness of the software.

    The BDS/GPS baseline solution software comprises three modules: BDS baseline processing, GPS baseline processing, and combined baseline processing. The modules are mutually independent but share the same data structures, and their data flows are essentially identical, as shown in Fig. 1.

    Figure 1. Data Stream of BGO Software

    Before baseline solution, valid dual-frequency observations must be selected: satellites below the elevation cutoff are rejected, gross errors in the observations are removed, and observations without matching ephemerides are discarded. Poor-quality observations can also be rejected interactively through visualization. The dual-frequency combination effectively eliminates the ionospheric delay, and the ionosphere-free pseudorange combination yields approximate station positions accurate to within 10 m, from which the network topology is formed so that users can inspect the horizontal distribution of the stations. During baseline solution, BDS and GPS single-system processing use the same algorithm; combined processing requires a unified coordinate and time frame and, as the number of redundant observations increases, a reasonable threshold for ambiguity fixing. After baseline solution, network adjustment is performed and unqualified baselines are removed until the adjustment results meet the requirements.
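As a concrete illustration of the preprocessing step above, the sketch below forms the first-order ionosphere-free pseudorange combination from dual-frequency observations. It is a hedged example rather than BGO's internal code; it assumes the GPS L1/L2 frequencies (BDS would use its own pair), and the function name iono_free_pseudorange is hypothetical.

```python
# GPS L1/L2 carrier frequencies in Hz (BDS would use its own frequency pair)
F1, F2 = 1575.42e6, 1227.60e6

def iono_free_pseudorange(p1: float, p2: float) -> float:
    """First-order ionosphere-free combination of two pseudoranges, in metres."""
    g = F1 ** 2 / (F1 ** 2 - F2 ** 2)
    return g * p1 - (g - 1.0) * p2

# e.g. combine the two code observations of one satellite at one epoch
rho = iono_free_pseudorange(21_000_003.2, 21_000_008.1)
```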

    High-precision baseline solution builds error equations from double-differenced observations. The BDS double-differenced observation is constructed as in Eq. (1):

    $$ \mathit{\Delta} \nabla L^{{C_m}{C_n}}_{{S_i}{S_j}} = \left( {L^{{C_n}}_{{S_j}} - L^{{C_n}}_{{S_i}}} \right) - \left( {L^{{C_m}}_{{S_j}} - L^{{C_m}}_{{S_i}}} \right) $$ (1)

    where $\Delta \nabla L$ denotes the double-differenced observation; $S_i$ and $S_j$ denote arbitrary stations; $C_m$ and $C_n$ denote arbitrary BDS satellites.
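A minimal sketch of Eq. (1) follows: it forms a double-differenced observation from the one-way observations of two stations to two satellites of the same system. The obs dictionary and the function name are hypothetical illustrations, not BGO data structures.

```python
# obs[station][satellite] is a hypothetical lookup of a one-way carrier-phase
# or pseudorange observation, in metres, at a common epoch.
def double_difference(obs, s_i, s_j, c_m, c_n):
    sd_n = obs[s_j][c_n] - obs[s_i][c_n]  # between-station single difference, C_n
    sd_m = obs[s_j][c_m] - obs[s_i][c_m]  # between-station single difference, C_m
    return sd_n - sd_m                    # double difference of Eq. (1)

obs = {"S1": {"C01": 2.10e7, "C02": 2.30e7},
       "S2": {"C01": 2.10e7 + 15.2, "C02": 2.30e7 - 8.4}}
dd = double_difference(obs, "S1", "S2", "C01", "C02")  # (-8.4) - 15.2 = -23.6
```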

    From the double-differenced observations constructed by Eq. (1), the error equation is established as Eq. (2):

    $$ \left[ \begin{array}{l} \mathit{\Delta} \nabla \boldsymbol{\varPhi} \\ \mathit{\Delta} \nabla \boldsymbol{P} \end{array} \right] = \boldsymbol{BX} + \boldsymbol{A}\mathit{\Delta} \nabla \boldsymbol{N} + \boldsymbol{V} $$ (2)

    where $\Delta \nabla \boldsymbol{\varPhi}$ and $\Delta \nabla \boldsymbol{P}$ denote the double-differenced carrier-phase and pseudorange observations, respectively; $\boldsymbol{X}$ denotes the baseline vector; $\Delta \nabla \boldsymbol{N}$ denotes the double-differenced integer ambiguities; $\boldsymbol{B}$ and $\boldsymbol{A}$ are coefficient matrices; $\boldsymbol{V}$ is the residual vector.
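The float solution of Eq. (2) amounts to ordinary least squares over the stacked design matrix [B A]. The sketch below is an assumption-laden illustration (a single epoch, no observation weighting), not BGO's estimator.

```python
import numpy as np

def float_solution(B: np.ndarray, A: np.ndarray, l: np.ndarray):
    """Least-squares float solution of l = B*X + A*N + V."""
    H = np.hstack([B, A])                       # stacked design matrix [B A]
    est, *_ = np.linalg.lstsq(H, l, rcond=None)
    n = B.shape[1]                              # 3 unknowns for one baseline
    return est[:n], est[n:]                     # X (baseline), N (float ambiguities)
```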

    The error equation of Eq. (2) is solved for the baseline vector and the float double-differenced ambiguities. The LAMBDA method [16, 17] is then used to fix the double-differenced integer ambiguities, which are removed from the observations, and the carrier-phase observations yield the high-precision baseline vector. During baseline solution, cycle slips are detected and repaired mainly with a robust Chebyshev polynomial fitting method [18] and the MW-GF combination method [19].
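For the MW-GF cycle-slip detection mentioned above, a minimal sketch is given below: the Melbourne-Wübbena (MW) and geometry-free (GF) combinations are epoch-wise smooth in the absence of slips, so a jump beyond a threshold flags a slip. The GPS L1/L2 constants are assumed, and the thresholding is illustrative rather than BGO's actual test.

```python
C = 299_792_458.0                      # speed of light, m/s
F1, F2 = 1575.42e6, 1227.60e6          # GPS L1/L2 (Hz); BDS uses its own pair
LAM_WL = C / (F1 - F2)                 # wide-lane wavelength, about 0.86 m

def mw_cycles(phi1, phi2, p1, p2):
    """Melbourne-Wübbena value in wide-lane cycles (phi in cycles, p in metres)."""
    wide_lane_phase = phi1 - phi2
    narrow_lane_code = (F1 * p1 + F2 * p2) / ((F1 + F2) * LAM_WL)
    return wide_lane_phase - narrow_lane_code

def gf_metres(phi1, phi2):
    """Geometry-free (GF) phase combination in metres; sudden jumps flag slips."""
    return (C / F1) * phi1 - (C / F2) * phi2

def slip_detected(prev, curr, threshold):
    # MW/GF values are nearly constant between epochs without slips; a jump
    # beyond the threshold (wide-lane cycles for MW) flags a cycle slip.
    return abs(curr - prev) > threshold
```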

    For dual-system BDS/GPS baseline solution, the double-differenced error equations of the two systems are simply stacked and adjusted together. Note, however, that between-satellite differencing must use satellites of the same system; otherwise inter-system signal hardware delays are introduced [20], which hinder fixing the double-differenced integer ambiguities. In addition, BDS and GPS differ in their time and coordinate frames, so combined processing must unify the frames.

    BDS time and GPS time are related by Eq. (3):

    $$ {t_C} = {t_G}-14\;{\rm{s}} $$ (3)

    where $t_C$ and $t_G$ denote BDS time and GPS time, respectively; both are atomic time scales with different origins [13].

    BDS and GPS coordinates are related by Eq. (4):

    $$ \begin{array}{c} \left[ {\begin{array}{*{20}{c}} {{X_C}}\\ {{Y_C}}\\ {{Z_C}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {{X_G}}\\ {{Y_G}}\\ {{Z_G}} \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {{T_X}}\\ {{T_Y}}\\ {{T_Z}} \end{array}} \right] + \\ \left[ {\begin{array}{*{20}{c}} D&{ - {R_Z}}&{{R_Y}}\\ {{R_Z}}&D&{ - {R_X}}\\ { - {R_Y}}&{{R_X}}&D \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {{X_G}}\\ {{Y_G}}\\ {{Z_G}} \end{array}} \right] \end{array} $$ (4)

    where the BDS coordinates $(X_C, Y_C, Z_C)$ and GPS coordinates $(X_G, Y_G, Z_G)$ are related through the seven parameters $T_X$, $T_Y$, $T_Z$, $D$, $R_X$, $R_Y$, $R_Z$. The BDS CGCS2000 coordinate system adopts the coordinates and velocity field of the ITRF97 frame at epoch 2000.0, while the current GPS WGS84 frame is essentially consistent with ITRF08. Therefore, the seven transformation parameters between ITRF97 (epoch 2000.0) and ITRF08 published on the ITRF website can be used to unify the BDS and GPS coordinate frames [11, 12].
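Equations (3) and (4) translate directly into code. The sketch below is illustrative only: the seven parameters are placeholders set to zero, and the real values must be taken from the ITRF website as described above.

```python
import numpy as np

def gpst_to_bdt(t_g: float) -> float:
    """Eq. (3): BDS time = GPS time - 14 s (both atomic time scales)."""
    return t_g - 14.0

def helmert(xyz_g, T, D, R):
    """Eq. (4): small-angle seven-parameter transformation, GPS frame -> BDS frame.

    T: translations (m); D: scale factor (unitless); R: rotations (rad).
    """
    rx, ry, rz = R
    m = np.array([[ D,  -rz,  ry],
                  [ rz,   D, -rx],
                  [-ry,  rx,   D]])
    xyz_g = np.asarray(xyz_g, dtype=float)
    return xyz_g + np.asarray(T, dtype=float) + m @ xyz_g

# Placeholder parameters only -- the real values come from the ITRF site
xyz_c = helmert((-2.0e6, 4.3e6, 4.0e6), T=(0.0, 0.0, 0.0), D=0.0, R=(0.0, 0.0, 0.0))
```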

    When processing the CPI control network of a high-speed railway, the software reads the observation and ephemeris files and generates the baseline network topology of the control network from single point positioning, as shown in Fig. 2. Before baseline solution, the relevant parameters are set, including the satellite elevation cutoff, error tolerance parameters, frames, troposphere model, ionosphere model, ambiguity ratio threshold, and minimum number of synchronous observation epochs. One of three modes, BDS, GPS, or combined, can then be selected for baseline solution. After the solution, the interface displays the estimated baseline components and their precision, and the residual vectors can be displayed to check the quality of the solution.

    Figure 2. Main Interface of BGO Software

    To test the correctness of GPS baseline solution in BGO, its results were compared with those of TGO and Bernese for 57 GPS baselines (longest 6 667 m, shortest 446 m), as shown in Fig. 3.

    Figure 3. Comparing GPS Baseline Components from BGO, TGO and Bernese Software

    Figures 3(a) and 3(b) show the differences ΔX, ΔY, ΔZ of the GPS baseline components between BGO and TGO, and between BGO and Bernese, respectively. In Fig. 3(a), 52 baselines agree within 2 cm in the X, Y, and Z directions, and 48 baselines agree at the mm level; for a few baselines, TGO's a posteriori variance components exceeded their limits, yielding larger differences from BGO. In Fig. 3(b), 55 baselines agree within 2 cm, and 49 baselines agree at the mm level.

    Figures 4(a)-4(c) show the internal precision σX, σY, σZ of the GPS baselines from BGO, TGO, and Bernese (reported to 0.1 mm, 1 mm, and 0.1 mm, respectively). Overall, the three packages achieve comparable precision for about 90% of the baselines.

    Figure 4. Comparing GPS Baseline Precision from BGO, TGO and Bernese Software

    To test the performance of BGO for combined BDS/GPS baselines, the commercial Trimble package TBC was used for comparison on the same 57 baselines, each containing both BDS and GPS observations. Figure 5 shows the differences ΔX, ΔY, ΔZ of the combined baseline components between BGO and TBC: 98% of the component differences are at the mm level, indicating that BGO processes combined baselines at a level comparable to TBC. Moreover, the internal precision of both packages is mostly at the mm level, so it is not compared in Fig. 5.

    Figure 5. Comparing BDS and GPS Combined Baseline Components from BGO and TBC Software

    These results show that the internal and external accuracy of BGO for GPS and combined BDS/GPS baselines matches that of TGO, Bernese, and TBC. Therefore, taking the BGO GPS and combined solutions as reference values, the correctness and reliability of its BDS-only baseline results are analyzed in Figs. 6 and 7: Fig. 6 compares the BDS baseline components with the GPS and combined ones, and Fig. 7 compares the internal precision of the BDS, GPS, and combined baseline solutions.

    Figure 6. Comparing BDS, GPS and BDS/GPS Combined Baseline Components from BGO Software
    Figure 7. The Statistics of Precision of BDS, GPS and BDS/GPS Combined Baseline Solutions

    Figure 6(a) shows the differences ΔX, ΔY, ΔZ between the BDS and GPS baseline components from BGO: 43 baselines agree within 2 cm in the X, Y, and Z directions, and 31 baselines agree at the mm level. Figure 6(b) shows the differences between the BDS and combined baseline components: 54 baselines agree within 2 cm and 38 at the mm level (for the 6th baseline in Fig. 6, the BDS solution is a float solution with large component differences, which are set to 0 in the figure).

    In Fig. 7, 93% of the combined baselines have component precision better than 0.5 mm, 1 mm, and 0.5 mm in the X, Y, and Z directions; about 90% of the BDS baselines and 95% of the GPS baselines are better than 1 mm, 2 mm, and 1 mm, respectively. Comparing the three solutions shows that, during the BDS trial operation phase, the internal precision of GPS baselines is slightly better than that of BDS, while combined BDS/GPS baselines are markedly more precise than either single system.

    BGO also provides network adjustment; the reliability of the baseline solutions is judged from the adjusted baseline component corrections, relative standard errors, and point precision. Unconstrained network adjustments were performed separately for the BDS, GPS, and combined baselines solved above.

    The corrections δX, δY, δZ from the unconstrained adjustments of the BDS, GPS, and combined baselines are mostly within ±1 cm, as shown in Figs. 8(a)-8(c), and the relative standard error of the weakest edge is better than 5.5 ppm (the specification limit); see Table 1. According to Fig. 8, Table 1, and the specification for high-speed railway engineering surveys [21], BGO solves BDS, GPS, and combined baselines reasonably and stably: the baseline vector corrections, weakest-edge relative standard errors, and weakest-point precision all meet the requirements of CPI control surveys, and each mode accurately recovers the coordinates of the 24 CPI control points.

    Figure 8. Baseline Vector Corrections from GPS, BDS and BDS/GPS Combined Unconstrained Adjustment

    Table 1. The Statistics of GPS, BDS and BDS/GPS Combined Unconstrained Adjustment Results
    Mode | Independent baselines | Redundant observations | Control points | Weakest-edge relative standard error/ppm | Weakest-point precision/mm
    GPS | 55 | 66 | 24 | 3.6 | 23.6
    BDS | 51 | 57 | 24 | 3.1 | 26.9
    Combined | 57 | 72 | 24 | 3.7 | 17.9

    This paper systematically studied the algorithms for combined BDS/GPS baseline solution and developed the high-precision BDS baseline solution software BGO. Tests on measured data of a high-speed railway CPI control network show that the software processes BDS, GPS, and combined BDS/GPS data with high precision: its GPS baseline solutions are on par with Trimble TGO and consistent with Bernese in accuracy, and its BDS and combined solutions match TBC. The greatest advantage of BGO is its joint BDS/GPS solution, which improves the pass rate and precision of single-system baselines. The CPI control network test demonstrates that BGO baseline results can be used for data processing of high-precision BDS and GPS control networks.

    http://ch.whu.edu.cn/cn/article/doi/10.13203/j.whugis20230252

  • Figure 1. Architecture Diagram of the Proposed Method

    Figure 2. Diagram of Dual-Transformer Blocks

    Figure 3. Comparison of Drivable-Area Detection Results of Different Methods on the KITTI Road Dataset

    Figure 4. Visualization of Test Results on the Xiamen City Road Dataset

    Figure 5. Comparison of Runtime and Maximum F-measure on the KITTI Road Dataset

    Figure 6. Attention Maps for Qualitative Evaluation

    Figure 7. Visualization of Road Boundary Details

    Figure 8. Qualitative Comparison of Different Transformer-Based Methods on Various Scenes

    Table 1. Test Results of Three KITTI Road Scenes/%

    Road category | Max F | Average precision | Precision | Recall
    UM_Road | 97.28 | 92.66 | 97.37 | 97.19
    UMM_Road | 98.09 | 94.74 | 97.74 | 98.45
    UU_Road | 96.85 | 91.42 | 96.73 | 96.96

    Table 2. Comprehensive Comparison on the KITTI Road Dataset

    Method | Max F/% | Average precision/% | Precision/% | Recall/% | False positive rate/% | False negative rate/% | Inference time/s
    LidCamNet[39] | 96.03 | 93.93 | 96.23 | 95.83 | 2.07 | 4.17 | 0.15
    LC-CRF[40] | 95.68 | 88.34 | 93.62 | 97.33 | 3.67 | 2.67 | 0.18
    TVFNet[19] | 95.34 | 90.26 | 95.73 | 94.94 | 2.33 | 5.06 | 0.04
    RGB36-Cotrain[41] | 95.55 | 93.71 | 95.68 | 95.42 | 2.37 | 4.58 | 0.10
    HID-LS[42] | 93.11 | 87.33 | 92.52 | 93.71 | 4.18 | 6.29 | 0.25
    RoadNet3[43] | 94.44 | 93.45 | 94.69 | 94.18 | 2.91 | 5.82 | 0.30
    OFANet[44] | 93.74 | 85.37 | 90.36 | 97.38 | 5.72 | 2.62 | 0.04
    ALO-AVG-MM[45] | 92.03 | 85.64 | 90.65 | 93.45 | 5.31 | 6.55 | 0.03
    RBANet[31] | 96.30 | 89.72 | 95.14 | 97.50 | 2.75 | 2.50 | 0.16
    PLARD[21] | 97.03 | 94.03 | 97.19 | 96.88 | 1.54 | 3.12 | 0.16
    SNE-RoadSeg[24] | 96.75 | 94.07 | 96.90 | 96.61 | 1.70 | 3.39 | 0.18
    RoadNetRT[29] | 92.55 | 93.21 | 92.94 | 92.16 | 3.86 | 7.84 | 0.08
    NIM-RTFNet[23] | 96.02 | 94.01 | 96.43 | 95.62 | 1.95 | 4.38 | 0.05
    Hadamard-FCN[46] | 94.85 | 91.48 | 94.81 | 94.89 | 2.85 | 5.11 | 0.02
    BJN[15] | 94.89 | 90.63 | 96.14 | 93.67 | 2.07 | 6.33 | 0.02
    HA-DeepLabv3+[20] | 94.83 | 93.24 | 94.77 | 94.89 | 2.88 | 5.11 | 0.06
    CLCFNet[47] | 96.38 | 90.85 | 96.38 | 96.39 | 1.99 | 3.61 | 0.02
    DFM-RTFNet[25] | 96.78 | 94.05 | 96.62 | 96.93 | 1.87 | 3.07 | 0.08
    SNE-RoadSeg+[26] | 97.50 | 93.98 | 97.41 | 97.58 | 1.43 | 2.42 | 0.08
    USNet[38] | 96.89 | 93.25 | 96.51 | 97.27 | 1.94 | 2.73 | 0.02
    HEAT[48] | 97.00 | 93.09 | 96.53 | 97.48 | 1.93 | 2.51 | 0.08
    Ours | 97.53 | 92.97 | 97.32 | 97.74 | 1.48 | 2.26 | 0.08

    Table 3. Test Results on the Cityscapes Dataset/%

    Method | Max F | Precision | Recall
    FCN[49] | 94.68 | 93.69 | 95.70
    SegNet[50] | 95.81 | 94.55 | 97.11
    RBANet[31] | 98.00 | 97.87 | 98.12
    USNet[38] | 98.27 | 98.26 | 98.28
    Ours | 98.54 | 98.35 | 98.73

    Table 4. Performance Impacts of Different Modules on the Whole Network/%

    Vanilla Transformer | Pyramid Transformer | Deep position encoding | Dual-Transformer blocks | Max F | Average precision | Precision | Recall | False positive rate | False negative rate
    √ | – | – | – | 83.69 | 87.59 | 81.34 | 86.18 | 10.89 | 13.82
    – | √ | – | – | 87.00 | 90.51 | 85.77 | 88.27 | 8.07 | 11.73
    – | √ | √ | – | 94.53 | 93.68 | 94.62 | 94.45 | 2.96 | 5.55
    – | √ | √ | √ | 97.53 | 93.97 | 97.32 | 97.74 | 1.48 | 2.26

    Table 5. Quantitative Comparison of Different Transformer-Based Methods/%

    Method | Max F | Average precision | Precision | Recall
    PVT[32] | 87.00 | 90.51 | 85.77 | 88.27
    SegFormer[33] | 91.67 | 92.47 | 89.85 | 93.56
    CMX[51] | 94.55 | 93.41 | 94.44 | 94.66
    Ours | 97.53 | 93.97 | 97.32 | 97.74
  • [1] Cui Mingyang, Huang Heye, Xu Qing, et al. Survey of Intelligent and Connected Vehicle Technologies: Architectures, Functions and Applications[J]. Journal of Tsinghua University (Science and Technology), 2022, 62(3): 493-508.
    [2] Zhang Yanjie, Huang Wei, Liu Xintao, et al. An Approach for High Definition (HD) Maps Information Interaction for Autonomous Driving[J]. Geomatics and Information Science of Wuhan University, 2023. DOI: 10.13203/j.whugis20230166.
    [3] Ying Shen, Jiang Yuewen, Gu Jiangyan, et al. High Definition Map Model for Autonomous Driving and Key Technologies[J]. Geomatics and Information Science of Wuhan University, 2023. DOI: 10.13203/j.whugis20230227.
    [4] Daoud M A, Mehrez M W, Rayside D, et al. Simultaneous Feasible Local Planning and Path-Following Control for Autonomous Driving[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(9): 16358-16370.
    [5] Pan J C, Sun H Y, Xu K C, et al. Lane-Attention: Predicting Vehicles' Moving Trajectories by Learning Their Attention over Lanes[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, USA, 2020.
    [6] Weber M, Xie J, Collins M D, et al. STEP: Segmenting and Tracking Every Pixel[C]//The 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track, New Orleans, USA, 2021.
    [7] Shinzato P Y, Wolf D F. A Road Following Approach Using Artificial Neural Networks Combinations[J]. Journal of Intelligent & Robotic Systems, 2011, 62(3): 527-546.
    [8] Alvarez J M, Gevers T, LeCun Y, et al. Road Scene Segmentation from a Single Image[C]//The 12th European Conference on Computer Vision: Volume Part VII, Florence, Italy, 2012.
    [9] Passani M, Yebes J J, Bergasa L M. CRF-Based Semantic Labeling in Miniaturized Road Scenes[C]//The 17th International IEEE Conference on Intelligent Transportation Systems, Qingdao, China, 2014.
    [10] Passani M, Yebes J J, Bergasa L M. Fast Pixelwise Road Inference Based on Uniformly Reweighted Belief Propagation[C]//IEEE Intelligent Vehicles Symposium, Seoul, Republic of Korea, 2015.
    [11] Vitor G B, Victorino A, Ferreira J V. A Probabilistic Distribution Approach for the Classification of Urban Roads in Complex Environments[C]//IEEE Workshop on International Conference on Robotics and Automation, Hong Kong, China, 2014.
    [12] Munoz D, Bagnell J A, Hebert M. Stacked Hierarchical Labeling[C]//The 11th European Conference on Computer Vision: Part VI, Heraklion, Crete, Greece, 2010.
    [13] Mendes C C T, Frémont V, Wolf D F. Exploiting Fully Convolutional Neural Networks for Fast Road Detection[C]//IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016.
    [14] Muñoz-Bulnes J, Fernandez C, Parra I, et al. Deep Fully Convolutional Networks with Random Data Augmentation for Enhanced Generalization in Road Detection[C]//The 20th International Conference on Intelligent Transportation Systems, Yokohama, Japan, 2017.
    [15] Che Manqiang, Li Shubin, Li Ming. Road Surface Semantic Segmentation Method Based on HarDNet Fully Convolutional Network[J]. Journal of Computer Applications, 2021, 41(S2): 76-80.
    [16] Jiang Tengping, Yang Bisheng, Zhou Yuzhou, et al. Bilevel Convolutional Neural Networks for 3D Semantic Segmentation Using Large-Scale LiDAR Point Clouds in Complex Environments[J]. Geomatics and Information Science of Wuhan University, 2020, 45(12): 1942-1948.
    [17] Yu B, Lee D, Lee J S, et al. Free Space Detection Using Camera-LiDAR Fusion in a Bird's Eye View Plane[J]. Sensors, 2021, 21(22): 7623.
    [18] Chen L, Yang J, Kong H. LiDAR-Histogram for Fast Road and Obstacle Detection[C]//IEEE International Conference on Robotics and Automation, Singapore, 2017.
    [19] Gu S, Zhang Y G, Yang J, et al. Two-View Fusion Based Convolutional Neural Network for Urban Road Detection[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems, Macau, China, 2019.
    [20] Fan R, Wang H L, Cai P D, et al. Learning Collision-Free Space Detection from Stereo Images: Homography Matrix Brings Better Data Augmentation[J]. IEEE/ASME Transactions on Mechatronics, 2022, 27(1): 225-233.
    [21] Chen Z, Zhang J, Tao D C. Progressive LiDAR Adaptation for Road Detection[J]. IEEE/CAA Journal of Automatica Sinica, 2019, 6(3): 693-702.
    [22] Khan A A, Shao J, Rao Y B, et al. LRDNet: Lightweight LiDAR Aided Cascaded Feature Pools for Free Road Space Detection[J]. IEEE Transactions on Multimedia, 2022, 99: 1-13.
    [23] Wang H L, Fan R, Sun Y X, et al. Applying Surface Normal Information in Drivable Area and Road Anomaly Detection for Ground Mobile Robots[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, USA, 2020.
    [24] Fan R, Wang H L, Cai P D, et al. SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentation for Accurate Freespace Detection[C]//The 16th European Conference on Computer Vision, Glasgow, UK, 2020.
    [25] Wang H L, Fan R, Sun Y X, et al. Dynamic Fusion Module Evolves Drivable Area and Road Anomaly Detection: A Benchmark and Algorithms[J]. IEEE Transactions on Cybernetics, 2022, 52(10): 10750-10760.
    [26] Wang H L, Fan R, Cai P D, et al. SNE-RoadSeg+: Rethinking Depth-Normal Translation and Deep Supervision for Freespace Detection[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic, 2021.
    [27] Song Shuang, Chen Chi, Yang Bisheng, et al. Large Field of View Array System Using Low Cost RGB-D Cameras[J]. Geomatics and Information Science of Wuhan University, 2018, 43(9): 1391-1398.
    [28] Meng Yiyue, Guo Chi, Liu Jingnan. Deep Reinforcement Learning Visual Target Navigation Method Based on Attention Mechanism and Reward Shaping[J]. Geomatics and Information Science of Wuhan University, 2023. DOI: 10.13203/j.whugis20230193.
    [29] Bai L, Lyu Y C, Huang X M. RoadNet-RT: High Throughput CNN Architecture and SoC Design for Real-Time Road Segmentation[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2021, 68(2): 704-714.
    [30] Ai Qinglin, Zhang Junrui, Wu Feiqing. AF-ICNet Semantic Segmentation Method for Unstructured Scenes Based on Small Target Category Attention Mechanism and Feature Fusion[J]. Acta Photonica Sinica, 2023, 52(1): 0110001.
    [31] Sun J Y, Kim S W, Lee S W, et al. Reverse and Boundary Attention Network for Road Segmentation[C]//IEEE/CVF International Conference on Computer Vision Workshop, Seoul, Republic of Korea, 2019.
    [32] Wang W H, Xie E Z, Li X, et al. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction Without Convolutions[C]//IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021.
    [33] Xie E, Wang W, Yu Z, et al. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers[J]. Advances in Neural Information Processing Systems, 2021, 34: 12077-12090.
    [34] Liu Z, Lin Y T, Cao Y, et al. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows[C]//IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021.
    [35] Fritsch J, Kühnl T, Geiger A. A New Performance Measure and Evaluation Benchmark for Road Detection Algorithms[C]//The 16th International Conference on Intelligent Transportation Systems, The Hague, Netherlands, 2013.
    [36] Geiger A, Lenz P, Stiller C, et al. Vision Meets Robotics: The KITTI Dataset[J]. International Journal of Robotics Research, 2013, 32(11): 1231-1237.
    [37] Cordts M, Omran M, Ramos S, et al. The Cityscapes Dataset for Semantic Urban Scene Understanding[C]//IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016.
    [38] Chang Y C, Xue F, Sheng F, et al. Fast Road Segmentation via Uncertainty-Aware Symmetric Network[C]//International Conference on Robotics and Automation, Philadelphia, USA, 2022.
    [39] Caltagirone L, Bellone M, Svensson L, et al. LiDAR-Camera Fusion for Road Detection Using Fully Convolutional Neural Networks[J]. Robotics and Autonomous Systems, 2019, 111: 125-131.
    [40] Gu S, Zhang Y, Tang J, et al. Road Detection Through CRF Based LiDAR-Camera Fusion[C]//2019 International Conference on Robotics and Automation, Montreal, Canada, 2019.
    [41] Han Z, Zhang C, Fu H, et al. Trusted Multi-view Classification[C]//International Conference on Learning Representations, New York, USA, 2020.
    [42] Gu S, Zhang Y G, Yuan X, et al. Histograms of the Normalized Inverse Depth and Line Scanning for Urban Road Detection[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(8): 3070-3080.
    [43] Lyu Y C, Bai L, Huang X M. Road Segmentation Using CNN and Distributed LSTM[C]//IEEE International Symposium on Circuits and Systems, Sapporo, Japan, 2019.
    [44] Zhang S C, Zhang Z, Sun L B, et al. One for All: A Mutual Enhancement Method for Object Detection and Semantic Segmentation[J]. Applied Sciences, 2019, 10(1): 13.
    [45] Reis F A L, Almeida R, Kijak E, et al. Combining Convolutional Side-Outputs for Road Image Segmentation[C]//International Joint Conference on Neural Networks, Budapest, Hungary, 2019.
    [46] Oeljeklaus M. An Integrated Approach for Traffic Scene Understanding from Monocular Cameras[M]. Düsseldorf: VDI Verlag, 2021.
    [47] Gu S, Yang J, Kong H. A Cascaded LiDAR-Camera Fusion Network for Road Detection[C]//IEEE International Conference on Robotics and Automation, Xi'an, China, 2021.
    [48] Han T, Li C M, Chen S Y, et al. HEAT: Incorporating Hierarchical Enhanced Attention Transformation into Urban Road Detection[J]. IET Intelligent Transport Systems, 2023(1): 1-20.
    [49] Shelhamer E, Long J, Darrell T. Fully Convolutional Networks for Semantic Segmentation[C]//IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015.
    [50] Badrinarayanan V, Kendall A, Cipolla R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
    [51] Zhang J M, Liu H Y, Yang K L, et al. CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers[J]. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(12): 14679-14694.


Publication history
  • Received: 2023-07-13
  • Published online: 2023-11-01
  • Issue published: 2024-04-04
