A Tag Cloud-based Visualization for Geotagged Text Information

Article information

HUA Yixin, LI Xiang, ZHAO Junxi, WANG Lina, ZHANG Jing

A Tag Cloud-based Visualization for Geotagged Text Information

Geomatics and Information Science of Wuhan University, 2015, 40(8): 1080-1087

http://dx.doi.org/10.13203/j.whugis20130794

Article history

Received: 2013-12-17

Institute of Surveying and Mapping, Information Engineering University, Zhengzhou 450052, China

First author: HUA Yixin, PhD, professor, specializes in applications of geographic information systems. E-mail: 13607680789@163.com

Corresponding author: LI Xiang, PhD. E-mail: Helloj2ee@126.com; wln_map@126.com

Foundation support: The National Natural Science Foundation of China, Nos. 41401467, 41271450, 41371383; the National Science & Technology Support Program of China, No. 2012BAK12B02.

Abstract: The wide use of the Internet has produced a growing volume of text information associated with geographic locations (geotagged text). Existing geographic information systems generally expose such data through external links, which requires frequent zooming, panning, and clicking, while other methods fail to express spatial relationships effectively. This paper presents the tag cloud map, a tag cloud-based visualization method for geotagged text information. We describe the design rationale and implementation workflow of the tag cloud map and build a prototype on a real data set from Tencent Microblog. The work focuses on cartogram generation algorithms for point and polygon features, algorithms for extracting keywords and word frequencies, and a tag placement algorithm driven by display rules for different scales and different times. Experiments show that the method helps users quickly perceive and grasp the overall characteristics and development trends in large volumes of geotagged text information.

Key words: tag cloud; tag cloud map; text information visualization; geotagged visualization

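Real life abounds with text information associated with geographic locations or regions, such as local gazetteers, news about events at a particular place, and microblog and WeChat posts on the Internet that are tied to a location. Such information is called location-associated text information (usually termed "geo-referenced" or "geotagged" in English). With the growing popularity of concepts such as crowdsourcing and volunteered geographic information [1], large amounts of geotagged text information have appeared. How to visualize it effectively, so as to give strong support to information exploration and discovery, has become a problem that needs to be solved.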

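Traditional methods for visualizing geotagged text information fall into two main groups: those centered on geographic information and those centered on text information.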

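Traditional geographic information system (GIS) software (such as ArcGIS and SuperMap) visualizes text information according to its type. Structured text information is stored in relational tables as attributes of geographic features; when a feature is clicked, the associated text is presented as a data table. Unstructured text information is handled through external links: the geographic region stores the storage locations of all text associated with it, and when the region is clicked, the text is opened by a corresponding external program (such as Notepad or a web browser) or by a plug-in embedded in the GIS software (Office or Adobe plug-ins).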

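This visualization method suits traditional geographic information with a "geometry + attribute" structure very well, but for the large volume of unstructured geotagged text information on the Internet, users find it hard to discover and explore useful information through it.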

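Encyclopedia software (such as Microsoft Encarta, Wikipedia, and Baidu Baike) takes an entirely different approach from GIS software: text is the main body, and the geographic location associated with the text is shown by a map tucked into a corner, as shown in Fig. 1. This representation places too much emphasis on the text information and expresses the spatial information too sparsely.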

Fig. 1 Text-centered Visualization Method

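Model	Maximum level (scale)	Minimum level (scale)	Maximum word frequency count	Tag size/px
Model 1	1:19 million	1:4.75 million	3	60
Model 2	1:4.75 million	1:2.38 million	12	120
Model 3	1:2.38 million	1:0.59 million	24	180
	1:0.59 million	1:0.29 million	3	60
Model 4	1:0.29 million	1:0.10 million	12	120
	1:0.10 million		24	180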
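
The scale-dependent display rules in the table above can be read as a simple lookup from the current map scale to a word-frequency limit and a tag size. The following is a minimal sketch of such a lookup, not the authors' implementation. The names `DisplayRule`, `RULES`, and `rule_for_scale` are hypothetical, the levels are assumed to be scale denominators, range boundaries are assumed to belong to the more detailed row, and rows whose model cell is blank in the table are kept with `model=None` rather than guessed.

```python
from typing import NamedTuple, Optional


class DisplayRule(NamedTuple):
    model: Optional[str]             # model label; None where the table cell is blank
    max_denominator: int             # coarsest scale of the range (1:max_denominator)
    min_denominator: Optional[int]   # most detailed scale; None where the table leaves it open
    max_word_frequency: int          # "maximum word frequency count" column
    tag_size_px: int                 # "tag size/px" column


# Rows copied from the table; scale denominators are written as plain integers.
RULES = [
    DisplayRule("Model 1", 19_000_000, 4_750_000, 3, 60),
    DisplayRule("Model 2", 4_750_000, 2_380_000, 12, 120),
    DisplayRule("Model 3", 2_380_000, 590_000, 24, 180),
    DisplayRule(None, 590_000, 290_000, 3, 60),
    DisplayRule("Model 4", 290_000, 100_000, 12, 120),
    DisplayRule(None, 100_000, None, 24, 180),
]


def rule_for_scale(denominator: int) -> Optional[DisplayRule]:
    """Return the table row whose scale range contains 1:denominator."""
    for rule in RULES:
        lower = rule.min_denominator if rule.min_denominator is not None else 0
        # Assumption: ranges are half-open (lower, upper], so a boundary
        # scale such as 1:4.75 million falls into the more detailed row.
        if lower < denominator <= rule.max_denominator:
            return rule
    return None  # coarser than 1:19 million (or non-positive): not covered by the table
```

Under these assumptions, `rule_for_scale(1_000_000)` selects the Model 3 row (word frequency limit 24, tag size 180 px), and a scale coarser than 1:19 million returns `None`.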

[1]	Li Deren, Qian Xinlin. A Brief Information of Data Management for Volunteered Geographic Information[J]. Geomatics and Information Science of Wuhan University, 2010, 25(4): 379-383 (in Chinese)

[2]	Coupland D. Microserfs: A Novel[M]. New York: Harper Collins, 2011

[3]	Milgram S. Psychological Maps of Paris[J]. Environmental Psychology: People and Their Physical Settings, 1976: 104-124

[4]	Jaffe A, Naaman M, Tassa T, et al. Generating Summaries and Visualization for Large Collections of Geo-referenced Photographs[C]. The 8th ACM International Workshop on Multimedia Information Retrieval, Santa Barbara, California, USA, 2006

[5]	Slingsby A, Dykes J, Wood J, et al. Interactive Tag Maps and Tag Clouds for the Multiscale Exploration of Large Spatio-temporal Datasets[C]. The Information Visualization, 11th International Conference, California, USA, 2007

[6]	Wood J, Dykes J, Slingsby A, et al. Interactive Visual Exploration of a Large Spatio-temporal Dataset: Reflections on a Geovisualization Mashup[J]. IEEE Transactions on Visualization and Computer Graphics, 2007, 13(6): 1176-1183

[7]	Maceachren A, Stryker M, Turton I, et al. Health GeoJunction: Place-time-concept Browsing of Health Publications[J]. International Journal of Health Geographics, 2010, 9(1): 23-26

[8]	Elmer M. Laconic History of Our World Map[OL]. http://maphugger.com/post/38323044556/laconic-history-of-the-world-2012-my-first, 2012

[9]	Dinh-Quyen N, Schumann H. Taggram: Exploring Geo-data on Maps through a Tag Cloud-Based Visualization[C]. The 14th Information Visualisation (IV) International Conference, London, 2010

[10]	Tobler W. Thirty Five Years of Computer Cartograms[J]. Annals of the Association of American Geographers, 2004, 94(1): 58-73

[11]	Dorling D. Area Cartograms: Their Use and Creation[OL]. http://www.dannydorling.org/wp-content/files/dannydorling_publication_id1448.pdf, 2013

[12]	Tobler W R. Pseudo-Cartograms[J]. Cartography and Geographic Information Science, 1986, 13(1): 43-50

[13]	Tversky B. Distortions in Memory for Maps[J]. Cognitive Psychology, 1981, 13(3): 407-433

[14]	Hyungeun J. Placegram: A Diagrammatic Map for Personal Geotagged Data Browsing[J]. IEEE Transactions on Visualization and Computer Graphics, 2010, 16(2): 221-234

[15]	Lee B, Riche N H, Karlson A K, et al. SparkClouds: Visualizing Trends in Tag Clouds[J]. IEEE Transactions on Visualization and Computer Graphics, 2010, 16(6): 1182-1189