ZHANG Guoyong, GONG Jianhua, SUN Jun, ZHOU Jieping, LI Wenhang, ZHANG Lihui, WANG Dongchuan, LI Wenning, HU Weidong, FAN Hongkui. An Interactive Individual Spatiotemporal Trajectory Extraction and Quality Evaluation Method for COVID-19 Cases[J]. Geomatics and Information Science of Wuhan University, 2021, 46(2): 177-183. DOI: 10.13203/j.whugis20200290

An Interactive Individual Spatiotemporal Trajectory Extraction and Quality Evaluation Method for COVID-19 Cases

Funds: 

The Strategic Priority Research Program of Chinese Academy of Sciences XDA19090114

Jiashan Science and Technology Plan Project 2018A08

The CAS Zhejiang Institute of Advanced Technology Fund ZK-CX-2018-04

  • Author Bio:

    ZHANG Guoyong, PhD candidate, specializes in 3D geographic information system and virtual geographical environment. E-mail: zhanggy@radi.ac.cn

  • Corresponding author:

    GONG Jianhua, PhD, professor. E-mail: gongjh@radi.ac.cn

  • Received Date: August 25, 2020
  • Published Date: February 04, 2021
  • Abstract:

    With the coronavirus disease 2019 (COVID-19) epidemic under control in China, scientific research on the transmission patterns of the virus has become essential for disease control, and demand for precise, structured trajectories of individual cases is growing. However, the spatiotemporal trajectory source strings retrieved from official websites are highly unstructured, so neither a purely manual method nor a fully automated algorithm can obtain precise trajectories efficiently. To resolve this conflict between efficiency and precision, a human-computer interactive (HCI) trajectory extraction and validation approach based on natural language processing (NLP) is proposed. The source string is first analyzed by NLP, and coarse trajectories are identified and extracted automatically; a user then confirms or edits these trajectories; finally, other users validate them by voting on whether they are correct. The key technologies of the approach are also investigated, including the trajectory location segmentation and combination algorithm, the trajectory quality evaluation algorithm, and the trajectory extraction and validation workflow. A comparative experiment using the local clustered cases in Harbin in April as a case study was conducted to evaluate the effectiveness and practicability of the approach. The results show that the proposed approach is significantly more efficient, roughly twice as efficient as extraction without NLP. The trajectory credibility evaluation also suggests that HCI extraction removes 26.34% of the missing locations and positioning errors present in trajectories extracted automatically by NLP alone. Furthermore, the validation results show that 92.63% of the trajectories were assessed as reliable, and that the incorrect trajectory nodes were created mainly by the NLP algorithm rather than by manual editing. The experimental results indicate that the proposed approach effectively improves both the efficiency and the quality of trajectory extraction. In addition, the prototype system could serve as a tool for epidemiological investigations, assisting doctors or patients.
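
    The voting-based validation step can be illustrated with a minimal sketch. The abstract does not give the paper's quality evaluation algorithm, so everything here is an assumption made for illustration: the `TrajectoryNode` class, its field names, the vote-ratio credibility score, the 0.5 threshold, and the toy data.

```python
from dataclasses import dataclass, field


@dataclass
class TrajectoryNode:
    """One stop in a case's trajectory (hypothetical structure, not the paper's)."""
    time: str
    place: str
    source: str  # "nlp" if kept as extracted, "user" if edited by hand
    votes: list = field(default_factory=list)  # True = a validator judged it correct

    def credibility(self) -> float:
        """Share of validators who accepted the node; 0.0 when nobody has voted."""
        return sum(self.votes) / len(self.votes) if self.votes else 0.0


def reliable_share(nodes, threshold=0.5):
    """Fraction of nodes whose credibility exceeds an assumed threshold."""
    if not nodes:
        return 0.0
    return sum(n.credibility() > threshold for n in nodes) / len(nodes)


# Toy data, invented purely for illustration.
nodes = [
    TrajectoryNode("t1", "place A", "nlp", [True, True, False]),
    TrajectoryNode("t2", "place B", "user", [True, True, True]),
    TrajectoryNode("t3", "place C", "nlp", [False, False, True]),
]

print(f"reliable share: {reliable_share(nodes):.2%}")
```

    Keeping the `source` field on each node is what would allow rejected nodes to be attributed to the NLP step or to manual editing, which is the kind of breakdown the abstract reports.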