Abstract:
Objectives Accurate, automatic extraction of buildings from high-resolution remote sensing images is important for applications such as urban planning, map data updating, and emergency response. Existing fully convolutional neural networks still suffer from missed and false building detections and from missing boundaries caused by spectral confusion.
Methods To overcome these limitations, a multi-feature fusion and object-boundary joint constraint network is proposed, based on an encoder-decoder structure. In the encoding stage, a continuous-atrous spatial pyramid module is placed at the end of the encoder to extract and combine multi-scale features without sacrificing too much useful information. In the decoding stage, a multi-output fusion constraint structure based on object and boundary integrates more accurate building features into the network and refines the boundary. In the skip connections between the encoding and decoding stages, a convolutional block attention module is introduced to enhance effective features. Furthermore, the multi-level outputs of the decoding stage are used to build a piecewise multi-scale weighted loss function for fine-grained updating of the network parameters.
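The idea of combining multi-level decoder outputs into one weighted loss can be illustrated with a minimal sketch. This is not the authors' loss: the abstract does not give its form, so the per-level loss (binary cross-entropy), the fixed weights, and the names `binary_cross_entropy` and `multi_scale_weighted_loss` are all assumptions made for illustration, and the piecewise schedule is not modelled.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels.

    Assumption: the per-level loss is plain BCE; the abstract does not say so.
    """
    total = 0.0
    for p, t in zip(pred, target):
        # Clip probabilities so that log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred)


def multi_scale_weighted_loss(outputs, targets, weights):
    """Weighted sum of per-level losses.

    outputs: one flat list of predicted probabilities per decoder level
    targets: one flat list of 0/1 labels per level (ground truth at that scale)
    weights: one hypothetical scalar weight per level
    """
    if not (len(outputs) == len(targets) == len(weights)):
        raise ValueError("outputs, targets and weights must have the same length")
    return sum(
        w * binary_cross_entropy(o, t)
        for o, t, w in zip(outputs, targets, weights)
    )


# Two hypothetical decoder levels: a full-resolution and a coarser output.
loss = multi_scale_weighted_loss(
    outputs=[[0.9, 0.2, 0.8, 0.1], [0.7, 0.3]],
    targets=[[1, 0, 1, 0], [1, 0]],
    weights=[1.0, 0.5],
)
```

In this sketch, giving the coarser level a smaller weight lets the full-resolution output dominate the gradient while the intermediate output still receives direct supervision.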
Results Comparative experiments are performed on the WHU and Massachusetts building datasets. The results show that the buildings extracted by the proposed method are close to the ground truth, and its quantitative scores are higher than those of five other state-of-the-art approaches. Specifically, intersection over union (IoU) and F1-score reach 90.44% and 94.98% on the WHU building dataset, and 72.57% and 84.10% on the Massachusetts building dataset. The proposed model also outperforms MFCNN and BRRNet in both complexity and efficiency.
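For reference, the two reported metrics can be computed from pixel-wise counts as in the sketch below. It assumes flat binary masks with 1 for building pixels and 0 for background; the function name `iou_and_f1` and the convention of returning 1.0 when both masks are empty are illustrative choices, not taken from the paper.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)  # false negatives

    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0
    return iou, f1


# Toy example with 5 pixels: tp = 2, fp = 1, fn = 1.
iou, f1 = iou_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

When both metrics are computed from the same pixel counts, they are tied by F1 = 2·IoU / (1 + IoU), so a higher IoU always implies a higher F1-score.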
Conclusions The proposed method not only improves the accuracy and completeness of extraction results for buildings affected by spectral confusion, but also preserves building boundaries well and is robust to variations in building scale.