Abstract:
Because a Web Map Service (WMS) has no explicit metadata mechanism for describing the domain themes of its maps, end users cannot easily discover a desired map resource in a target domain. We propose a text-based method for extracting WMS domain themes and extending WMS metadata to better support geographic information retrieval. Specifically, we present a new unsupervised multi-label text classification algorithm that measures the semantic relevance between feature words in a WMS capabilities document and multiple domain themes defined by the GEOSS societal benefit areas (SBAs). The Semantic Web for Earth and Environmental Terminology (SWEET) and WordNet dictionaries are used to calculate the shortest semantic path to a given theme for both Earth science terminology and general terms. In addition, we extend the WMS domain theme description by adding theme tags to the ISO 19115:2003 geographic information metadata standard in a flexible and standard-conformant way. Experimental results indicate that the proposed multi-label text classification method achieves higher recall and precision than other text classification methods, such as naive Bayes, linear support vector machine (SVM), and logistic regression, and than methods that use SWEET or WordNet alone.