Objectives The k nearest neighbor query method under the road network environment plays an important role in geographic information system, point of interest discovery, smart city, data mining and material distribution. The existing methods mainly deal with the query problem that the query object and data object are single data type. However, in reality, the data object type of the target object set is often not single, and the query problem will become more complex. The existing nearest neighbor query methods in road network cannot directly solve the neighbor query problem in which the query object is a point and the data object includes the data mixed with points and line segments. In order to make up for the shortcomings of the existing methods, the nearest neighbor query algorithm of mixed complex data in the road network is proposed.
Methods The query process is divided into three parts: Preprocessing, data set reduction and data set refining. Firstly, the data set is preprocessed, the line segments are transformed into points. A mapping relationship is formed between the intersection of line segments in the road network and line segments, and the corresponding line segments are found by querying the location of the points. The query points, data points and Hilbert values are stored using arrays. Secondly, the data object set is refined preliminarily by pruning method and shortest path theorem to reduce the number of data objects, and the data set filtering algorithm is given. Finally, the points in the data set are further pruned to calculate the shortest path network distance from the query point to the points in the candidate set. Compared with these distances, the shortest path distance can be determined, and then the nearest neighbor is accurately found.
Results Experimental results show that the proposed algorithm effectively reduces the data set and the amount of calculation in the query process, further refines the nearest neighbor candidate set and reduces the query scope and the query time. There is no need to store the distance information between all boundary nodes in the road network, so a large amount of redundant distance information is eliminated.The data points and road network data that meet the pruning conditions around the query points are pruned. Therefore, the increase of road network scale has little impact on the time cost of the algorithm.
Conclusions The proposed algorithm can directly deal with the nearest neighbor query of mixed data in the road network environment. When the number of data objects and the scale of the road network are large, the algorithm has obvious advantages in central processing unit running time and input/output cost.