HAN Ting, CHEN Siyu, MA Jin, CAI Guorong, ZHANG Wuming, CHEN Yiping. Road Image Free Space Detection via Learnable Deep Position Encoding[J]. Geomatics and Information Science of Wuhan University, 2024, 49(4): 582-594. DOI: 10.13203/j.whugis20230252

Road Image Free Space Detection via Learnable Deep Position Encoding

  • Received Date: July 13, 2023
  • Available Online: November 01, 2023
  • Objectives 

    Free space detection is a crucial foundation for scene perception in advanced driver assistance systems. Methods based on convolutional neural networks cannot build global contextual information, which produces voids and interruptions in the predicted results. Transformer-based methods, in turn, lack local understanding, which leads to boundary misalignment and overshoot.

    Methods 

    To this end, we propose a pyramid Transformer architecture with learnable deep position encoding for road free space detection. First, a pyramid Transformer backbone is designed to extract road features from a global perspective. Second, local window attention is employed in dual-Transformer blocks to compensate for the loss of detail. Finally, because traditional non-learnable position encoding ignores the spatial correlation between pixels and the real world, a learnable position encoding is constructed from deep convolutional features to resolve the misalignment between attention and semantics.
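
As a rough illustration of how a position encoding can be derived from convolutional features, the sketch below applies a depthwise 3x3 convolution with zero padding to a token grid and adds the result back to the tokens. This is a generic convolution-based encoding, not the authors' implementation: the function name `conv_position_encoding`, the kernel layout, and the toy input are all assumptions, and in a real model the kernel weights would be learned.

```python
def conv_position_encoding(feat, kernels):
    """Add a convolution-derived position encoding to a feature map.

    feat:    H x W x C nested lists (hypothetical token grid of one stage).
    kernels: C x 3 x 3 nested lists, one depthwise kernel per channel;
             in a real model these weights would be learned.
    """
    h, w, c = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ch in range(c):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        # Zero padding: neighbours outside the grid add nothing.
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += kernels[ch][dy + 1][dx + 1] * feat[yy][xx][ch]
                # Residual add: token feature plus its position encoding.
                out[y][x][ch] = feat[y][x][ch] + acc
    return out


# Toy input (assumption): a 3x3 single-channel map of ones, all-ones kernel.
feat = [[[1.0] for _ in range(3)] for _ in range(3)]
kernels = [[[1.0] * 3 for _ in range(3)]]
encoded = conv_position_encoding(feat, kernels)
corner, edge, centre = encoded[0][0][0], encoded[0][1][0], encoded[1][1][0]
print(corner, edge, centre)
```

Although every input token is identical, the corner, edge, and centre tokens receive different values, because the zero padding makes the convolution output depend on where a token sits in the grid.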

    Results 

    The model is tested and evaluated on the KITTI road, Cityscapes, and Xiamen road datasets. The results show that our method achieves maximum F-measures of 97.53% and 98.54% on KITTI and Cityscapes, respectively.
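
For readers unfamiliar with the metric, the maximum F-measure is commonly obtained by thresholding the per-pixel road confidence at several values and keeping the best F-score. The sketch below shows that computation on flat lists; the function names, the toy scores, and the threshold set are assumptions for illustration and do not reproduce the benchmark's official evaluation code.

```python
def f_measure(pred, truth):
    """F1 score for binary labels, computed as 2TP / (2TP + FP + FN)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def max_f_measure(scores, truth, thresholds):
    """Best F1 over a set of confidence thresholds."""
    return max(
        f_measure([s >= th for s in scores], truth) for th in thresholds
    )


# Toy data (assumption): six pixels with road confidences and ground truth.
scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]
truth = [1, 1, 0, 1, 0, 0]
best = max_f_measure(scores, truth, thresholds=[0.3, 0.5, 0.7])
print(round(best, 4))
```

Here the lowest threshold wins: it recovers all three road pixels at the cost of one false positive, giving an F-score of 6/7.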

    Conclusions 

    On the KITTI road benchmark, our method outperforms existing algorithms, offering higher efficiency together with higher stability and accuracy. It also provides high-precision semantic prior information for tasks such as path planning and trajectory prediction in automotive driver assistance systems.
