Abstract:
Spatio-temporal clustering is an important technique for mining dynamic patterns of geographical phenomena. It aims to partition data into groups such that intra-cluster similarity is maximized and inter-cluster similarity is minimized, and it has become an active research topic in spatio-temporal data mining and knowledge discovery. However, the performance of existing methods depends heavily on a series of user-specified parameters, and the significance of the discovered clusters cannot be evaluated objectively. To address these problems, this paper develops a spatio-temporal clustering method based on permutation tests that considers both spatio-temporal proximity and attribute similarity. First, homogeneous sub-regions with similar attributes are identified in the dataset using permutation tests. Then, the mean squared error criterion is adopted to merge these homogeneous sub-regions into larger clusters, and a permutation testing procedure is developed to evaluate the significance of the detected clusters. Finally, to improve the efficiency of the proposed method without sacrificing accuracy, a fast permutation testing method is developed that exploits the topological information among entities. Experiments on both simulated and real-life datasets show that the proposed method effectively detects spatio-temporal clusters of different shapes and sizes with similar thematic attributes, and that the subjectivity of clustering is significantly reduced. The proposed method is successfully applied to China's monthly average precipitation database, and the detected dynamic patterns (i.e., spatio-temporal clusters) can help to investigate and interpret the developmental trends of precipitation.
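
To illustrate the idea of evaluating cluster significance with a permutation test, the following minimal Python sketch tests a single candidate cluster. It is a generic illustration, not the procedure developed in this paper: it assumes one scalar thematic attribute per entity, uses the within-cluster sum of squared errors as the test statistic, and ignores spatio-temporal topology. The names `sse`, `permutation_p_value`, `attributes`, and `member_idx`, as well as the sample values, are hypothetical.

```python
import random


def sse(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def permutation_p_value(attributes, member_idx, n_perm=999, seed=0):
    """Estimate how often a random relabelling of attribute values gives a
    candidate cluster that is at least as homogeneous as the observed one.

    attributes: one attribute value per entity (assumed scalar).
    member_idx: indices of the entities in the candidate cluster.
    """
    rng = random.Random(seed)
    observed = sse([attributes[i] for i in member_idx])

    shuffled = list(attributes)  # copy, so the input is left untouched
    hits = 0
    for _ in range(n_perm):
        # Null hypothesis: attribute values are unrelated to entity location,
        # so they can be reassigned to entities at random.
        rng.shuffle(shuffled)
        if sse([shuffled[i] for i in member_idx]) <= observed:
            hits += 1

    # The +1 terms count the observed arrangement as one permutation,
    # which keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: the first four entities have nearly identical values.
attributes = [10.0, 10.2, 9.9, 10.1, 3.0, 25.0, 17.5, 1.2, 30.3, 8.0, 22.0, 14.0]
member_idx = [0, 1, 2, 3]
p = permutation_p_value(attributes, member_idx)
print(f"estimated p-value: {p:.3f}")
```

A small estimated p-value suggests that the homogeneity of the candidate cluster is unlikely to arise from a random arrangement of attribute values. In this sketch the only settings left to the user are the number of permutations and the random seed.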