一种融合多维关系的地理环境时空主题发现方法

A Method for Geographical Environment Spatiotemporal Topic Discovery of Multi‑dimensional Relationships

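  • Summary: In-depth mining of battlefield text data can uncover spatiotemporal topic structures with high quality and efficiency, and thereby reveal the spatiotemporal patterns in how battlefield events develop. Because existing topic discovery methods do not apply well to spatiotemporal topic discovery with multi-dimensional heterogeneous relationships, this paper proposes a spatiotemporal topic discovery method based on joint clustering that integrates multi-dimensional relationships. First, a topic relationship network is constructed with geographical environment entities, geographical locations, and event topics as nodes. Then, the Tucker decomposition of the tensor model is used to establish a complete expression of the topic relationships, which serves as the objective function for topic classification. Finally, block value matrix decomposition is applied for joint clustering to obtain the topic classification results and the cohesive structure. Experimental results show that the method effectively discovers topic structures with spatiotemporal semantic relationship features. The resulting topics capture the association between geographical environment elements and spatiotemporal topics, show the cohesion of spatiotemporal topics over geographical location and event topic labels, and reflect how the topics evolve.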

     

    Abstract:
    Objectives: Spatiotemporal topic analysis of battlefield text data reveals how geographical environment elements are distributed in space and how they affect battlefield activities, from the micro to the macro scale and from scattered sites to gathering places. It also uncovers the spatial distribution and development patterns of battlefield events. This enriches the ways in which the battlefield environment can be perceived, provides a new means of battlefield environment efficiency analysis, and is of considerable value for an in-depth understanding of battlefield environment knowledge.
    Methods: The key to improving the quality of spatiotemporal topic discovery in the geographical environment is to construct a composite entity relationship network effectively and to integrate multi-dimensional heterogeneous relationships for topic clustering. First, a spatiotemporal topic tensor model integrating multi-dimensional relationships is constructed, and the Tucker decomposition of this model gives the complete expression of the topic relationships. Then, the feature vector space of multi-dimensional relational clustering is constructed as the objective function for topic classification, block value matrix decomposition is used for joint clustering, and the core tensor matrix is used to address data sparsity. Finally, the block value matrix obtained from multi-dimensional relational clustering is used to derive the association values between geographical environment elements and spatiotemporal topics.
    Results: The results show that: (1) The geographical environment entities and entity relationships are correctly clustered into the spatiotemporal topic structure. The accuracy rates were 88.4% and 86.9% respectively on the training set, and 87.3% and 85.8% respectively on the test set. (2) The number of entities and tags clustered under different topic structures decreases gently as the topic scale shrinks. The most frequent topic tags are maneuver, attack and interception, and the most frequent location tags are highlands, roads and villages. (3) Compared with the latent Dirichlet allocation (LDA) algorithm, the multi-dimensional relationship joint clustering method generally mines more entities and labels, which suggests that integrating multi-dimensional relationships yields a more accurate and clearer spatiotemporal topic structure. (4) The block value matrix obtained from multi-dimensional relationship clustering reflects the internal feature relationships of the spatiotemporal topic structure of the geographical environment, indicating that the spatiotemporal topics are strongly cohesive.
    Conclusions: This method can effectively improve the quality and efficiency of spatiotemporal topic discovery. The topics it obtains better show the cohesive correlation among geographical environment elements, geographical locations and event topics, which provides a basis for clearly reflecting the evolution of spatiotemporal topics and supports tracking the development trend of events. Because this paper uses only the co-occurrence frequency of entity words as the weight when constructing the relationship matrix, the data analysis carries some bias. In future work, we will combine the attention mechanism to mine multi-source text data in depth and improve the efficiency and accuracy of data analysis. Building on the effective discovery of cohesive hot spots, we will also establish the temporal and spatial correlation between different hot spots, in order to infer how events change and to provide an important reference for the dynamic deduction of the battlefield environment.
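The Tucker step named in the Methods can be illustrated with a minimal sketch. It assumes a toy entity × location × topic-tag co-occurrence tensor and uses a truncated higher-order SVD (HOSVD), one standard way to compute a Tucker decomposition, which may differ from the solver used in the paper. The helper names `unfold`, `mode_product`, `hosvd` and `reconstruct`, the tensor values, and the ranks are all hypothetical. Reading groups off the factor matrices with an argmax is only a simplified stand-in for the block value matrix decomposition and works here because the toy blocks are cleanly separated.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode-n product: multiply `matrix` into axis `mode` of `tensor`."""
    moved = np.moveaxis(tensor, mode, -1)   # put the target axis last
    result = moved @ matrix.T               # contract over that axis
    return np.moveaxis(result, -1, mode)    # restore the axis order

def hosvd(tensor, ranks):
    """Truncated HOSVD: returns the core tensor and one factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])         # leading left singular vectors
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each factor basis
    return core, factors

def reconstruct(core, factors):
    """Rebuild the tensor as core x_1 A x_2 B x_3 C."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical toy data: 6 entities, 4 locations, 4 topic tags.
# Two co-occurrence blocks with different weights (assumed values).
X = np.zeros((6, 4, 4))
X[0:3, 0:2, 0:2] = 2.0   # entities 0-2 with locations 0-1 and tags 0-1
X[3:6, 2:4, 2:4] = 1.0   # entities 3-5 with locations 2-3 and tags 2-3

core, (A, B, C) = hosvd(X, ranks=(2, 2, 2))
X_hat = reconstruct(core, [A, B, C])
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)

# Simplified grouping (an assumption, not block value decomposition):
# assign each row to the factor column with the largest absolute loading.
entity_groups = np.abs(A).argmax(axis=1)
location_groups = np.abs(B).argmax(axis=1)
tag_groups = np.abs(C).argmax(axis=1)

print("core shape:", core.shape)
print("relative reconstruction error:", rel_error)
print("entity groups:", entity_groups)
print("location groups:", location_groups)
print("tag groups:", tag_groups)
```

In this sketch the core tensor plays the role of a compact summary of how the entity, location and tag groups interact. On real, sparse co-occurrence data the factor loadings would not separate this cleanly, which is where a dedicated co-clustering step such as the block value matrix decomposition described above would be needed.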

     
