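Abstract (translated from the Chinese original): Storing and processing spatial data in relational databases is the dominant approach in geographic information systems (GIS). However, with the development of the Internet of Things, the mobile Internet, cloud computing, and spatial data acquisition technologies, spatial data have moved from being merely massive in volume to exhibiting big data characteristics, which poses new challenges for storage and management in both data volume and processing mode. This paper first analyzes the limitations of the traditional centralized storage and management model in handling big data, namely the adaptability of stored objects, the scalability of storage capacity, and the demand for highly concurrent processing. Then, after reviewing the characteristics of the current mainstream NoSQL databases, it points out the limitations of storing spatial big data solely in a NoSQL database with respect to data manipulation, querying, and efficient data management. Finally, considering the requirements that spatial big data storage in GIS places on storage scalability and on highly concurrent data processing and access, it proposes a distributed storage and integrated processing strategy for spatial big data based on an in-memory database and a NoSQL database. A prototype system was developed to verify the feasibility and effectiveness of the proposed strategy.

Abstract: Geospatial data held in databases have taken on the characteristics of big data, driven by the development of the Internet, the mobile Internet, cloud computing, and especially spatial data acquisition technologies. When handling spatial big data, traditional spatial database management techniques based on relational database management systems (RDBMS) run into problems that include the unstructured nature of spatial objects, the need for highly scalable storage capacity, and the high concurrency of big data application environments. This paper examines mainstream NoSQL databases, which handle unstructured big data successfully and are widely used in Internet applications but lack spatial features: their data manipulation and query modes cannot meet the requirements of GIS applications. To resolve this problem, the paper proposes a strategy that uses a NoSQL database as the warehouse for spatial big data and a traditional spatial database as the application server. The storage system architecture, key technologies, and solutions are discussed. A prototype system was developed on MongoDB, PostgreSQL, and SQLite to verify the feasibility and effectiveness of the strategy.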
Keywords:
- spatial database
- big data
- NoSQL database
- distributed storage
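Table 1 Mapping of Spatial Data Storage Objects in Each Database
| Item | SQLite:Mem | PostgreSQL | MongoDB |
| --- | --- | --- | --- |
| Storage format of spatial location information | Text | bin | BSON |
| Database object for a data source | database | database | database |
| Database object for a layer | table | table | collection |
| Database object for a spatial object | row | row | document |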
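The mapping in Table 1 can be illustrated with a minimal sketch that uses only Python's standard library. It assumes the text format in the SQLite:Mem column is WKT, which the table does not state. The layer name, feature ids, coordinates, and the helper functions `load_layer` and `rows_to_documents` are hypothetical. The MongoDB side is represented by plain dictionaries, not real BSON documents, and this is not the prototype system's code.

```python
import json
import sqlite3

# Hypothetical sample layer: the name, ids, and coordinates are invented for illustration.
LAYER_NAME = "roads"
FEATURES = [
    (1, "LINESTRING (0 0, 10 10)"),
    (2, "LINESTRING (5 0, 5 20)"),
]


def load_layer(conn, layer, features):
    """Store a layer as a table and each spatial object as a row (geometry kept as text)."""
    conn.execute(f'CREATE TABLE "{layer}" (fid INTEGER PRIMARY KEY, geom TEXT NOT NULL)')
    conn.executemany(f'INSERT INTO "{layer}" (fid, geom) VALUES (?, ?)', features)
    conn.commit()


def rows_to_documents(conn, layer):
    """Reshape each row as a dict, standing in for a document in a collection named after the layer."""
    cursor = conn.execute(f'SELECT fid, geom FROM "{layer}" ORDER BY fid')
    return [{"_id": fid, "layer": layer, "geom": geom} for fid, geom in cursor]


# Data source -> database: an in-memory SQLite database plays the SQLite:Mem role.
conn = sqlite3.connect(":memory:")
load_layer(conn, LAYER_NAME, FEATURES)

# Layer -> collection, spatial object -> document (plain dicts here, not BSON).
documents = rows_to_documents(conn, LAYER_NAME)
print(json.dumps(documents, indent=2))
```

In a real deployment the dictionaries would be handed to a MongoDB driver and the geometry would likely use a binary or GeoJSON encoding. The sketch only shows how the table/row and collection/document levels line up.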
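Table 2 Average Time Consumption for One Shapefile
| Step | Time/s | Remarks |
| --- | --- | --- |
| Import into the in-memory database | 20.6 | - |
| Split the in-memory layer into map sheets | 2.3 | 12 columns × 3 rows |
| Append to the target layer | 46.7 | - |
| Total | 69.6 | - |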
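Table 3 Data Extraction Based on the Spatial Index
| Step | Time/s | Remarks |
| --- | --- | --- |
| Metadata query | 0.6 | - |
| Map sheet query within layers | 2.3 | 8 layers |
| Map sheet data append | 978 | 3 861 map sheets |
| Total | 981 | 12 574 739 objects |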
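Table 4 Time Consumption of Overlay Analysis for Each Layer
| Layer | Map sheets | Objects | Time A/s | Time B/s (in-memory database) |
| --- | --- | --- | --- | --- |
| L1 | 73 | 253 454 | 192 | 126 |
| L2 | 385 | 1 264 189 | 643 | 397 |
| L3 | 656 | 2 159 563 | 1072 | 658 |
| L4 | 656 | 2 111 632 | 1123 | 744 |
| L5 | 658 | 2 176 057 | 1070 | 723 |
| L6 | 762 | 2 475 838 | 1119 | 691 |
| L7 | 581 | 1 879 148 | 649 | 419 |
| L8 | 90 | 254 858 | 215 | 141 |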
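The per-layer figures in Table 4 can be compared with a little arithmetic. The sketch below copies the rows from the table and computes the ratio of the two timings for each layer and overall. It assumes that column A is the run without the in-memory database, which the table itself does not state. The `speedup` helper is a hypothetical name.

```python
# Rows copied from Table 4: (layer, map_sheets, objects, time_a_s, time_b_s).
# Assumption: time A is measured without the in-memory database, time B with it.
TABLE4 = [
    ("L1", 73, 253454, 192, 126),
    ("L2", 385, 1264189, 643, 397),
    ("L3", 656, 2159563, 1072, 658),
    ("L4", 656, 2111632, 1123, 744),
    ("L5", 658, 2176057, 1070, 723),
    ("L6", 762, 2475838, 1119, 691),
    ("L7", 581, 1879148, 649, 419),
    ("L8", 90, 254858, 215, 141),
]


def speedup(time_a, time_b):
    """Ratio of time A to time B; values above 1 mean B was faster."""
    return time_a / time_b


total_sheets = sum(row[1] for row in TABLE4)
total_objects = sum(row[2] for row in TABLE4)
total_a = sum(row[3] for row in TABLE4)
total_b = sum(row[4] for row in TABLE4)

for layer, sheets, objects, time_a, time_b in TABLE4:
    print(f"{layer}: {speedup(time_a, time_b):.2f}x, "
          f"{objects / time_b:,.0f} objects/s with the in-memory database")

print(f"Totals: {total_sheets} map sheets, {total_objects} objects")
print(f"Overall: {total_a} s (A) vs {total_b} s (B), {speedup(total_a, total_b):.2f}x")
```

The map sheet and object totals come to 3 861 and 12 574 739, which agree with the counts reported in Table 3.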