ZHANG Sanqiang, SONG Guomin, JIA Fenli, CHEN Lingyu. Character Life-Track Information Model and Information Extraction Method[J]. Geomatics and Information Science of Wuhan University, 2022, 47(5): 700-706. DOI: 10.13203/j.whugis20190424

Character Life-Track Information Model and Information Extraction Method

Funds:

The National Key Research and Development Program of China (2017YFB0503500)

The National Natural Science Foundation of China (41671407, 41701457, 41801317)

More Information
  • Author Bio:

    ZHANG Sanqiang, master's degree, specializes in operational environment data engineering. E-mail: 1390724098@qq.com

  • Corresponding author:

    SONG Guomin, PhD, professor. E-mail: ccllyy123456@163.com

  • Received Date: May 06, 2020
  • Published Date: May 04, 2022
  •   Objectives  In human-related geographic information systems (GIS), the spatiotemporal analysis of character information has received increasing attention. It helps GIS users generate thematic maps and visualize human geographic content. As GIS becomes more intelligent, it is important to combine GIS requirements with natural language processing (NLP) methods to build a character information model.
      Methods  First, we review the research status of character information models in GIS and NLP and propose the concept of a life-track, which is composed mainly of a series of character event mentions. Second, taking into account the feasibility of existing information extraction methods, we define a conceptual character life-track information model. The model focuses on event information to highlight the spatiotemporal elements of a character, and also includes character attribute and relationship information. Finally, we design a complete information extraction process for the model, with online character encyclopedia pages as the data source. This paper focuses on two sub-tasks in the process: one uses time features and OpenHowNet semantic calculations to identify event mentions, and the other uses linguistic features and a conditional random field (CRF) model to extract location information.
      Results  Experimental results show that the event mention identification method achieves an accuracy of 91.8%. Although the average F1 value of location information extraction is only 78% with a limited labeled corpus, analyzing the weights of the transition matrix of the CRF model yields some useful conclusions: (1) The location phrase and its adjacent words have an obvious effect as features. (2) Dependency syntactic parsing and the relative position of a word in the sentence have little influence on the extraction. (3) The target of location information extraction is the place where the event occurred, but in a few cases location phrases are unrelated to the event location. This is the main reason for the low accuracy.
      Conclusions  Combining GIS with NLP is a promising direction for intelligent GIS development. The character life-track information model provides an example of the large-scale use of ubiquitous internet information. Our subsequent research will focus on improving the methods used in the extraction process and applying them to more types of online text.
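
As a rough illustration of the model described in the abstract, the following Python sketch represents a life-track as a time-ordered list of event mentions alongside attribute and relationship information. It is not the authors' implementation. All class and function names (`EventMention`, `LifeTrack`, `find_event_mentions`, `build_life_track`) are hypothetical, and the time feature is reduced to an assumed regular expression for four-digit years in English text. The OpenHowNet semantic calculation and the CRF-based location extraction are omitted, so the `location` field is left empty.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed time feature: a four-digit year between 1000 and 2099.
# The paper's actual time features are not specified in the abstract.
TIME_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


@dataclass
class EventMention:
    """A sentence treated as an event mention, with its time and location."""
    text: str
    time: str
    location: Optional[str] = None  # would be filled by a location extractor


@dataclass
class LifeTrack:
    """Conceptual record: events plus attribute and relationship information."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    relationships: List[Tuple[str, str]] = field(default_factory=list)
    events: List[EventMention] = field(default_factory=list)


def find_event_mentions(sentences: List[str]) -> List[EventMention]:
    """Keep only sentences that contain a time expression (simplified filter)."""
    mentions = []
    for sentence in sentences:
        match = TIME_PATTERN.search(sentence)
        if match:
            mentions.append(EventMention(text=sentence, time=match.group(0)))
    return mentions


def build_life_track(name: str, sentences: List[str]) -> LifeTrack:
    """Assemble a life-track with event mentions sorted by year."""
    track = LifeTrack(name=name)
    track.events = sorted(find_event_mentions(sentences), key=lambda e: e.time)
    return track


if __name__ == "__main__":
    sample = [
        "In 1925 he moved to the capital.",
        "He liked painting.",
        "He was born in 1900 in a small town.",
    ]
    for event in build_life_track("Example Person", sample).events:
        print(event.time, "|", event.text)
```

In this sketch a sentence without a time expression is simply discarded. The abstract indicates that the actual method combines time features with semantic calculations, so a real filter would be less strict than this one.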