Abstract:
Objectives: In recent years, extreme weather events have increased and sudden disasters have become more frequent, placing higher demands on disaster emergency response. Once a disaster occurs, information collection is the key to response decision-making. With the rapid development of the Internet, social media platforms have become an important source of emergency disaster information. However, within a short time after a disaster, these platforms accumulate large amounts of duplicated, erroneous, and even malicious content. Social media content therefore needs to be screened effectively by technical means to provide a basis for accurate disaster emergency response.
Methods: Deep learning has greatly improved the accuracy and efficiency of text-processing tasks. Taking earthquakes as an example, this study collected over 50,000 microblog posts published in the 72 hours after five major earthquakes in China between 2013 and 2022. Each sample was manually labeled with one or more of five labels: hazard information, loss information, rescue information, public opinion information, and useless information. A multi-label classification model was then built by transfer learning from a pre-trained BERT model.
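To illustrate what "one or more of five labels" means for the model's inputs and outputs, the sketch below encodes a post's labels as a multi-hot vector and decodes per-label scores with an independent sigmoid. The label order, the 0.5 threshold, the binary cross-entropy loss, and all function names are assumptions chosen for illustration; the abstract does not state the loss or decision rule actually used.

```python
import math

# The five categories come from the abstract; their order here is an assumption.
LABELS = ["hazard", "loss", "rescue", "public_opinion", "useless"]


def encode(tags):
    """Turn a collection of label names into a multi-hot target vector."""
    unknown = set(tags) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in tags else 0 for name in LABELS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(logits, threshold=0.5):
    """Score each label independently, so several labels can be assigned at once."""
    return [LABELS[i] for i, z in enumerate(logits) if sigmoid(z) >= threshold]


def bce_loss(logits, target):
    """Mean binary cross-entropy over the labels (a common multi-label loss)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, target):
        p = sigmoid(z)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(target)
```

Unlike a softmax over classes, the per-label sigmoid does not force the scores to compete, which is what lets a single post carry, for example, both loss and rescue information.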
Results: After fine-tuning, the model's classification accuracy reached 95% on the training set and 91% on the test set. Single-label AUC scores ranged from 0.952 to 0.998.
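A single-label AUC treats each of the five labels as its own binary problem and measures how often a positive sample is scored above a negative one. The following is a minimal sketch of that computation using the pairwise (Mann-Whitney) definition; the function names and the toy data in the usage note are hypothetical and are not taken from the study.

```python
def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Runs in O(P * N), which is fine for a sketch.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_label_auc(targets, scores):
    """Compute one AUC per label column of multi-hot targets and score rows."""
    n_labels = len(targets[0])
    return [
        auc([row[j] for row in targets], [row[j] for row in scores])
        for j in range(n_labels)
    ]
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` returns 0.75, because three of the four positive-negative pairs are ranked correctly.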
Conclusions: Both metrics indicate that the model is highly reliable. The model can be applied to emergency management during sudden disasters, where it can support rapid disaster assessment.