Semantic Segmentation of High-Resolution Remote Sensing Images Based on Improved FuseNet Combined with Atrous Convolution

YANG Jun; YU Xizi

doi:10.13203/j.whugis20200305

Volume 47 Issue 7

Jul. 2022



Citation: YANG Jun, YU Xizi. Semantic Segmentation of High-Resolution Remote Sensing Images Based on Improved FuseNet Combined with Atrous Convolution[J]. Geomatics and Information Science of Wuhan University, 2022, 47(7): 1071-1080. DOI: 10.13203/j.whugis20200305


YANG Jun^{1, 2, 3, 4},
YU Xizi^{2, 3, 4}

1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
2. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China
3. National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou 730070, China
4. Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou 730070, China

Funds:

The National Natural Science Foundation of China 61862039

Science and Technology Program of Gansu Province 20JR5RA429

2021 Central Government Funds for Guiding Local Science and Technology Development 2021-51

Excellent Platform Support Project of Lanzhou Jiaotong University 201806


Author Bio:
YANG Jun, PhD, professor, specializes in computer graphics, image processing, and geographic information systems. E-mail: yangj@mail.lzjtu.cn
Received Date: September 23, 2020
Published Date: July 04, 2022


Abstract

Objectives: With the development and popularization of deep learning theory, deep neural networks are widely used in image analysis and interpretation. High-resolution remote sensing images carry a large amount of information, complex data, and rich feature information. Most current semantic segmentation networks, however, were designed for natural images rather than for these characteristics. As a result, they cannot effectively extract the detailed features of ground objects in remote sensing images, and their segmentation accuracy needs to be improved.
Methods: We propose the improved FuseNet with atrous convolution convolutional neural network (IFA-CNN). First, the improved FuseNet fuses the elevation information of digital surface model (DSM) images with the color information of red-green-blue (RGB) images, and a multimodal data fusion scheme is proposed to address the poor fusion of the RGB branch and the DSM branch. Second, atrous convolution flexibly adjusts the receptive field to capture multiscale features. A decoder built from deconvolution and upsampling then enlarges the feature maps. Finally, a Softmax classifier produces the semantic segmentation results.
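To make the receptive-field remark concrete, the following minimal sketch shows how a dilation rate widens the span of a convolution kernel without adding weights. It is an illustration in plain Python, not the IFA-CNN implementation: the function names `effective_kernel_size` and `atrous_conv1d`, the 1-D setting, and the example kernel are assumptions made here for brevity.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are spaced `dilation` samples apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def atrous_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D atrous convolution.

    Written as cross-correlation, as is usual in deep learning frameworks.
    Hypothetical helper for illustration only.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for tap, weight in enumerate(kernel):
            # Each tap reads the input `dilation` samples after the previous tap.
            acc += weight * signal[start + tap * dilation]
        out.append(acc)
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    kernel = [1, 0, -1]  # three weights regardless of the dilation rate
    for rate in (1, 2, 4):
        print(rate, effective_kernel_size(len(kernel), rate),
              atrous_conv1d(signal, kernel, rate))
```

With the same three weights, dilation rates of 1, 2, and 4 cover spans of 3, 5, and 9 input samples, which is the sense in which the receptive field can be adjusted to capture features at several scales.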
Results: Compared with related algorithms, IFA-CNN effectively reduces edge burrs and thinned boundaries in the segmented images and segments larger objects such as buildings and trees more accurately. It also effectively reduces missed segmentation, and its segmentation of shadow-covered areas is close to perfect. The m_F₁ scores achieved by our model on the open ISPRS (International Society for Photogrammetry and Remote Sensing) Potsdam and Vaihingen datasets are 91.6% and 90.4%, respectively, exceeding those of the related algorithms by a considerable margin.
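For readers unfamiliar with the metric, a mean F₁ score is commonly obtained by averaging per-class F₁ scores derived from a confusion matrix. The sketch below assumes that convention (an unweighted average over classes); the abstract does not spell out the exact evaluation protocol, and the function name `mean_f1` and the toy matrix are hypothetical.

```python
def mean_f1(confusion):
    """Unweighted mean of per-class F1 scores.

    confusion[i][j] counts pixels of true class i predicted as class j.
    Assumed convention for illustration, not the paper's evaluation code.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 for a class never seen or predicted.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


if __name__ == "__main__":
    toy = [
        [8, 2],  # true class 0
        [1, 9],  # true class 1
    ]
    print(round(mean_f1(toy), 4))
```

Because every class contributes equally to the average, a small class that is segmented poorly lowers the score as much as a large one would.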
Conclusions: (1) With the multimodal data fusion strategy, the virtual fusion (V-Fusion) unit yields more accurate segmentation than the fusion unit used by the FuseNet network. (2) The encoder-decoder structure effectively improves the segmentation accuracy of small target features and thereby reduces the loss of detailed information. (3) While IFA-CNN performs multimodal data fusion, the atrous convolution expands the receptive field to extract multiscale information.
Keywords:
- high-resolution remote sensing image
- deep convolutional neural network
- atrous convolution
- semantic segmentation
- FuseNet

