Abstract:
Since the coronavirus disease 2019 (COVID-19) epidemic was brought under control in China, scientific research on the transmission patterns of the virus has become essential for disease control. Consequently, the demand for precise, structured trajectories of individual cases is increasing. However, because the spatiotemporal trajectory source strings retrieved from the official website are highly unstructured, it is difficult to obtain a precise trajectory efficiently with either a purely manual method or a fully automated algorithm. To resolve this conflict between efficiency and precision in trajectory extraction, a human-computer interactive (HCI) trajectory extraction and validation approach based on natural language processing (NLP) was proposed. The source string was first analyzed by NLP, and coarse trajectories were identified and extracted automatically; a user then confirmed or edited the trajectories, after which other users validated them by voting on whether they were correct. The key technologies of the approach were also investigated, including the trajectory location segmentation and combination algorithm, the trajectory quality evaluation algorithm, and the trajectory extraction and validation workflow. A comparative experiment using the local clustered cases in Harbin during April as a case study was conducted to evaluate the effectiveness and practicability of the proposed approach. The results show that the proposed approach is about twice as efficient as the extraction method without NLP. The trajectory credibility evaluation also suggests that the HCI extraction method reduces the missing locations and positioning errors in trajectories extracted automatically by NLP alone by 26.34%.
Furthermore, the validation results suggest that 92.63% of the trajectories were assessed as reliable, and that the incorrect trajectory nodes were mainly produced by the NLP algorithm rather than by the manual method. According to the experimental results, the proposed approach can effectively improve both the efficiency and the quality of trajectory extraction. In addition, our prototype system can serve as a potential tool to assist doctors or patients in epidemiological investigations.