A Distributed Storage Strategy for Spatial Big Data Based on NoSQL Databases

Geo-spatial Big Data Storage Based on NoSQL Database

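  • Abstract: Storing and processing spatial data in relational databases is the mainstream approach in the field of geographic information systems (GIS). However, with the development of the Internet of Things, the mobile Internet, cloud computing, and spatial data acquisition technologies, spatial data have shifted from being merely massive to exhibiting big data characteristics, which poses new challenges for spatial data storage and management in terms of both data volume and processing mode. This paper first analyzes the limitations of the traditional centralized storage and management model in processing and applying big data, including the adaptability of storage objects, the scalability of storage capacity, and the demand for high-concurrency processing. Then, based on an analysis of the characteristics of the current mainstream NoSQL databases, it points out the limitations of storing spatial big data solely in a NoSQL database with respect to data operation modes, query modes, and efficient data management. Finally, considering the requirements that spatial big data storage in GIS places on the scalability of database storage capacity and on high-concurrency data processing and access, it proposes a distributed storage and integrated processing strategy for spatial big data based on an in-memory database and a NoSQL database. A prototype system was developed to verify the feasibility and effectiveness of the proposed storage strategy.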

     

    Abstract: Geospatial data in databases have taken on the characteristics of big data in tandem with the development of the Internet, mobile Internet, cloud computing, and, especially, spatial data acquisition technologies. In handling spatial big data, traditional spatial database management techniques based on Relational Database Management Systems have encountered problems including the unstructured nature of spatial objects, the need for highly scalable storage capacity, and the high concurrency of big data application environments. This paper examines the mainstream NoSQL databases, which deal successfully with unstructured big data and are widely used in Internet applications but lack spatial features: their data operation and query modes cannot meet the requirements of GIS applications. To resolve this problem, this paper proposes a strategy that takes a NoSQL database as a warehouse for spatial big data and a traditional spatial database as the application server. The storage system architecture, the key technologies, and the solutions are discussed. A prototype system was developed based on MongoDB, PostgreSQL, and SQLite to verify the feasibility and effectiveness of the strategy.

     
