Abstract:
Objectives With the rapid development of the Internet, social media has become an important source of information about emergency events. However, social media contains a large amount of duplicated, erroneous and even malicious content, which must be classified effectively to provide more accurate information for disaster emergency response.
Methods Deep learning has greatly improved the accuracy and efficiency of text classification. Taking earthquake disasters as an example, this paper builds a multi-label classification model based on transfer learning with bidirectional encoder representations from transformers (BERT). Over 50 000 posts about five earthquakes were collected as training samples from Sina Weibo, a popular social media platform in China. Each sample was manually assigned one or more labels, such as hazard information, loss information, rescue information, public opinion information and useless information.
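The multi-label formulation described above can be illustrated with a minimal sketch. It assumes a multi-hot target vector, one independent sigmoid per label, a binary cross-entropy loss and a decision threshold of 0.5; none of these details are stated in the abstract, and the short label names and the logits are hypothetical stand-ins for the output of a fine-tuned encoder.

```python
import math

# Short names for the five categories listed in the abstract (naming is an assumption).
LABELS = ["hazard", "loss", "rescue", "public_opinion", "useless"]


def encode_labels(tags):
    """Turn a set of label names into a multi-hot target vector."""
    unknown = set(tags) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1.0 if name in tags else 0.0 for name in LABELS]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_loss(logits, targets):
    """Mean binary cross-entropy, treating each label as its own yes/no decision."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(targets)


def predict_labels(logits, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold."""
    return [name for name, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Hypothetical logits for one post; a real model would produce these.
example_logits = [2.0, 1.5, -3.0, -2.0, -4.0]
example_target = encode_labels({"hazard", "loss"})
print(predict_labels(example_logits))
print(bce_loss(example_logits, example_target))
```

Unlike a softmax over mutually exclusive classes, the per-label sigmoid allows a single post to carry several labels at once, or none.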
Results After fine-tuning, the classification accuracy of the proposed model reaches 97% on the training dataset and 92% on the test dataset. The area under the curve (AUC) score for each label ranges from 0.952 to 0.998.
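As an illustration of how such metrics can be computed, the sketch below evaluates a per-label AUC and an exact-match accuracy on toy data. It assumes that the AUC is the area under the ROC curve and that accuracy means all labels of a sample are predicted correctly; the abstract specifies neither, and the scores shown are invented for the example rather than taken from the paper.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_label_auc(y_true_rows, y_score_rows, label_names):
    """Compute one ROC AUC per label column."""
    result = {}
    for j, name in enumerate(label_names):
        truths = [row[j] for row in y_true_rows]
        scores = [row[j] for row in y_score_rows]
        result[name] = roc_auc(truths, scores)
    return result


def subset_accuracy(y_true_rows, y_score_rows, threshold=0.5):
    """Fraction of samples whose thresholded label vector matches exactly."""
    hits = 0
    for truth, scores in zip(y_true_rows, y_score_rows):
        predicted = [1 if s >= threshold else 0 for s in scores]
        hits += predicted == list(truth)
    return hits / len(y_true_rows)


# Toy data: four samples, two labels (values are illustrative only).
names = ["hazard", "loss"]
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.15], [0.4, 0.1]]
print(per_label_auc(y_true, y_score, names))
print(subset_accuracy(y_true, y_score))
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for a test set of realistic size.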
Conclusions The results indicate that multi-label classification using BERT transfer learning is highly reliable. The proposed model can be applied to emergency management services for earthquake events, supporting rapid disaster rescue and relief.